Elasticsearch: Possible to process aggregation results?


It's a bit more complicated, but here it goes. This requires Elasticsearch 1.4, because it relies on the scripted_metric aggregation:

{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "term": {
          "serviceId": 1
        }
      }
    }
  },
  "aggs": {
    "executionTimes": {
      "scripted_metric": {
        "init_script": "_agg['values'] = new java.util.HashMap();",
        "map_script": "if (_agg.values[doc['callerId'].value]==null) {_agg.values[doc['callerId'].value]=[doc['duration'].value];} else {_agg.values[doc['callerId'].value].add(doc['duration'].value);}",
        "combine_script": "someHashMap = new java.util.HashMap();for(x in _agg.values.keySet()) {value=_agg.values[x]; sum=0; for(y in value) {sum+=y}; someHashMap.put(x,sum)}; return someHashMap;",
        "reduce_script": "finalArray = []; finalMap = new java.util.HashMap(); for(map in _aggs){for(x in map.keySet()){if(finalMap.containsKey(x)){value=finalMap.get(x);finalMap.put(x,value+map.get(x));} else {finalMap.put(x,map.get(x))}}}; finalAvgValue=0; finalMaxValue=-1; finalMinValue=-1; for(key in finalMap.keySet()){currentValue=finalMap.get(key);finalAvgValue+=currentValue; if(finalMinValue<0){finalMinValue=currentValue} else if(finalMinValue>currentValue){finalMinValue=currentValue}; if(currentValue>finalMaxValue) {finalMaxValue=currentValue}}; finalArray.add(finalMaxValue); finalArray.add(finalMinValue); finalArray.add(finalAvgValue/finalMap.size()); return finalArray",
        "lang": "groovy"
      }
    }
  }
}

I'm not saying this is the best approach, only that it's the one I could find. The scripts are not in their best form either and could probably be cleaned up and improved. I wanted to show, though, that it is possible. Keep in mind that scripted_metric is only available from 1.4.

The basic idea is to use the scripts to build a data structure that holds the information you need, computed step by step through the phases of the scripted metric aggregation (init, map, combine, reduce). Note that the aggregation is performed for only one serviceId. If you want to do this for all serviceIds, I think you would need to rethink the data structure used in the scripts.
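To make the phases easier to follow, here is a minimal Python sketch of what the four scripts in the query above compute. It is only a simulation of the logic, not Elasticsearch code: the function names and the sample documents are hypothetical, and the two lists stand in for two shards.

```python
from collections import defaultdict


def map_shard(docs):
    """init + map: collect the durations per callerId on one shard."""
    values = defaultdict(list)
    for doc in docs:
        values[doc["callerId"]].append(doc["duration"])
    return values


def combine_shard(values):
    """combine: collapse each caller's durations into a per-shard sum."""
    return {caller: sum(durations) for caller, durations in values.items()}


def reduce_shards(shard_results):
    """reduce: merge the per-shard sums, then return [max, min, avg] over callers."""
    totals = defaultdict(int)
    for partial in shard_results:
        for caller, subtotal in partial.items():
            totals[caller] += subtotal
    sums = list(totals.values())
    return [max(sums), min(sums), sum(sums) / len(sums)]


# Hypothetical documents on two shards (not the data from the question).
shard_a = [{"callerId": 1, "duration": 100}, {"callerId": 2, "duration": 50}]
shard_b = [{"callerId": 1, "duration": 200}, {"callerId": 2, "duration": 30}]

result = reduce_shards([combine_shard(map_shard(s)) for s in (shard_a, shard_b)])
print(result)  # [300, 80, 190.0]
```

Caller 1 totals 300 and caller 2 totals 80 across both shards, so the sketch returns the maximum, minimum and average of those two totals. Like the Groovy version, it does not handle the case where no documents match.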

For the query above and for the exact data you provided the output is this:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "executionTimes": {
      "value": [
        1202,
        1033,
        "1117.5"
      ]
    }
  }
}

The values in the value array are ordered [max, min, avg], as set by reduce_script. Each statistic is taken over the per-callerId sums of duration.
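On the client side, the array can be unpacked by position. The following is a small sketch in Python, assuming the response has already been received as JSON text; the helper name unpack_execution_times is hypothetical, and the input is just the aggregations part of the output shown above. Note that the average comes back as a string in that output, so the sketch converts it to a float.

```python
import json

# Aggregations part of the response shown above, trimmed for brevity.
response_text = (
    '{"aggregations": {"executionTimes": {"value": [1202, 1033, "1117.5"]}}}'
)


def unpack_execution_times(response):
    """Name the positional [max, min, avg] values returned by reduce_script."""
    max_total, min_total, avg_total = response["aggregations"]["executionTimes"]["value"]
    # The average is a JSON string in the sample output, so coerce it to float.
    return {"max": max_total, "min": min_total, "avg": float(avg_total)}


stats = unpack_execution_times(json.loads(response_text))
print(stats)  # {'max': 1202, 'min': 1033, 'avg': 1117.5}
```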


The upcoming version 2.0.0 will include a new feature called "Reducers". Reducers will allow you to calculate aggregations over aggregations.
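As a rough sketch of what that could look like: the feature was released in 2.0 under the name pipeline aggregations, and the request body below assumes its max_bucket, min_bucket and avg_bucket aggregations with a buckets_path. It is built as a Python dict only so it can be printed as JSON. The aggregation names (callers, total, maxTotal, minTotal, avgTotal) are made up for illustration, and the bool/filter query is an assumed replacement for the filtered query used earlier.

```python
import json

# Request body sketch: sum duration per callerId, then take max/min/avg
# over those per-caller sums with sibling pipeline aggregations.
query = {
    "size": 0,  # only the aggregations are needed, not the hits
    "query": {"bool": {"filter": {"term": {"serviceId": 1}}}},
    "aggs": {
        "callers": {
            "terms": {"field": "callerId"},
            "aggs": {"total": {"sum": {"field": "duration"}}},
        },
        "maxTotal": {"max_bucket": {"buckets_path": "callers>total"}},
        "minTotal": {"min_bucket": {"buckets_path": "callers>total"}},
        "avgTotal": {"avg_bucket": {"buckets_path": "callers>total"}},
    },
}

print(json.dumps(query, indent=2))
```

One caveat under these assumptions: a terms aggregation returns only the top buckets by default, so with many callerIds its size would need to be raised for the statistics to cover every caller.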

Related post: https://github.com/elasticsearch/elasticsearch/issues/8110