
Creating DataFrame from ElasticSearch Results


Or you could use the json_normalize function of pandas:

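    import pandas as pd

    # res is the response dict returned by es.search(...)
    # pandas >= 1.0; on older versions use: from pandas.io.json import json_normalize
    df = pd.json_normalize(res['hits']['hits'])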

Then filter the resulting DataFrame by column name.
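As a minimal sketch of that filtering step: the `res` dict below is a made-up stand-in for a real `es.search(...)` response, and the `_source` field names (`path`, `level`) are assumptions for illustration only. It assumes pandas 1.0 or later, where the function is available as `pd.json_normalize`.

```python
import pandas as pd

# Hypothetical response, shaped like what es.search(...) returns.
res = {
    "hits": {
        "hits": [
            {"_id": "a1", "_index": "logs", "_score": 1.0,
             "_source": {"path": "app2.log", "level": "INFO"}},
            {"_id": "b2", "_index": "logs", "_score": 1.0,
             "_source": {"path": "app1.log", "level": "WARN"}},
        ]
    }
}

# Nested dicts are flattened into dotted column names, e.g. "_source.path".
df = pd.json_normalize(res["hits"]["hits"])

# Option 1: pick columns explicitly by name.
filtered = df[["_id", "_source.path"]]

# Option 2: keep every column whose name contains "_source.".
source_only = df.filter(like="_source.")

print(filtered)
print(source_only)
```

Selecting by an explicit list keeps the column order you ask for, while `DataFrame.filter(like=...)` is handy when you want all document fields without listing them.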


Better yet, you can use the fantastic pandasticsearch library:

    from elasticsearch import Elasticsearch
    es = Elasticsearch('http://localhost:9200')
    result_dict = es.search(index="recruit", body={"query": {"match_all": {}}})

    from pandasticsearch import Select
    pandas_df = Select.from_dict(result_dict).to_pandas()


There is a nice toy called pd.DataFrame.from_dict that you can use in a situation like this:

In [34]:
import pandas as pd

Data = [{u'_id': u'a1XHMhdHQB2uV7oq6dUldg',
         u'_index': u'logstash-2014.08.07',
         u'_score': 1.0,
         u'_type': u'logs',
         u'fields': {u'@timestamp': u'2014-08-07T12:36:00.086Z',
                     u'path': u'app2.log'}},
        {u'_id': u'TcBvro_1QMqF4ORC-XlAPQ',
         u'_index': u'logstash-2014.08.07',
         u'_score': 1.0,
         u'_type': u'logs',
         u'fields': {u'@timestamp': u'2014-08-07T12:36:00.200Z',
                     u'path': u'app1.log'}}]

In [35]:
df = pd.concat(map(pd.DataFrame.from_dict, Data), axis=1)['fields'].T

In [36]:
print(df.reset_index(drop=True))
                 @timestamp      path
0  2014-08-07T12:36:00.086Z  app2.log
1  2014-08-07T12:36:00.200Z  app1.log

Here it is in four steps:

1, Read each item in the list (which is a dictionary) into a DataFrame

2, We can put all the items in the list into one big DataFrame by concatenating them side by side (column-wise, axis=1). Since step #1 has to be done for each item, we can use map to do it.

3, Then we access the columns labeled with 'fields'

4, We probably want to rotate the DataFrame 90 degrees (transpose) and reset_index if we want the index to be the default int sequence.
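The four steps can be sketched one at a time, which makes the intermediate shapes easier to inspect. This is only an unrolled version of the one-liner, using the same two hits; the intermediate names (`frames`, `wide`, `fields`) are introduced here for illustration. The last line shows a shorter alternative that is not part of the answer above: building the DataFrame directly from the `fields` dicts.

```python
import pandas as pd

data = [
    {"_id": "a1XHMhdHQB2uV7oq6dUldg", "_index": "logstash-2014.08.07",
     "_score": 1.0, "_type": "logs",
     "fields": {"@timestamp": "2014-08-07T12:36:00.086Z", "path": "app2.log"}},
    {"_id": "TcBvro_1QMqF4ORC-XlAPQ", "_index": "logstash-2014.08.07",
     "_score": 1.0, "_type": "logs",
     "fields": {"@timestamp": "2014-08-07T12:36:00.200Z", "path": "app1.log"}},
]

# Step 1: one DataFrame per hit; the keys of the nested 'fields' dict
# become the row index and the scalar values are broadcast down the rows.
frames = [pd.DataFrame.from_dict(hit) for hit in data]

# Step 2: place the per-hit frames side by side (axis=1).
wide = pd.concat(frames, axis=1)

# Step 3: the label 'fields' now occurs once per hit, so this selects
# one column per hit.
fields = wide["fields"]

# Step 4: transpose so each hit is a row, then use a default integer index.
df = fields.T.reset_index(drop=True)
print(df)

# Alternative (an assumption about what you need, not the answer's method):
# if only the 'fields' values matter, build the frame from them directly.
direct = pd.DataFrame([hit["fields"] for hit in data])
```

The direct construction skips the intermediate frames entirely, so it may be preferable when the metadata columns (`_id`, `_index`, and so on) are not needed.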
