Elasticsearch: Possible to process aggregation results?
It's a bit more complicated, but here it goes (this only works in 1.4 and later, because that is when the scripted_metric aggregation was introduced):
{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": { "term": { "serviceId": 1 } }
    }
  },
  "aggs": {
    "executionTimes": {
      "scripted_metric": {
        "init_script": "_agg['values'] = new java.util.HashMap();",
        "map_script": "if (_agg.values[doc['callerId'].value]==null) {_agg.values[doc['callerId'].value]=[doc['duration'].value];} else {_agg.values[doc['callerId'].value].add(doc['duration'].value);}",
        "combine_script": "someHashMap = new java.util.HashMap();for(x in _agg.values.keySet()) {value=_agg.values[x]; sum=0; for(y in value) {sum+=y}; someHashMap.put(x,sum)}; return someHashMap;",
        "reduce_script": "finalArray = []; finalMap = new java.util.HashMap(); for(map in _aggs){for(x in map.keySet()){if(finalMap.containsKey(x)){value=finalMap.get(x);finalMap.put(x,value+map.get(x));} else {finalMap.put(x,map.get(x))}}}; finalAvgValue=0; finalMaxValue=-1; finalMinValue=-1; for(key in finalMap.keySet()){currentValue=finalMap.get(key);finalAvgValue+=currentValue; if(finalMinValue<0){finalMinValue=currentValue} else if(finalMinValue>currentValue){finalMinValue=currentValue}; if(currentValue>finalMaxValue) {finalMaxValue=currentValue}}; finalArray.add(finalMaxValue); finalArray.add(finalMinValue); finalArray.add(finalAvgValue/finalMap.size()); return finalArray",
        "lang": "groovy"
      }
    }
  }
}
I'm not saying this is the best approach, only that it's the only one I could find. The scripts are not in their best form either and can probably be cleaned up and improved. I just wanted to show that it is possible. Keep in mind that it requires 1.4.
The basic idea is to use the scripts to build a data structure that holds the information you need, computed in the separate steps (init, map, combine, reduce) of the scripted metric aggregation. Note that the aggregation is performed for only one serviceId. If you want to do this for all serviceIds, I think you will need to rethink the data structure used in the scripts.
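To make the four steps more concrete, here is a rough Python simulation of what the scripts above do. It is only a sketch, not Elasticsearch code: the function names (map_phase, combine_phase, reduce_phase), the two shards and the sample documents are all made up for illustration, and it assumes at least one matching document.

```python
from collections import defaultdict


def map_phase(docs):
    """init_script + map_script: collect durations per callerId on one shard."""
    values = defaultdict(list)
    for doc in docs:
        values[doc["callerId"]].append(doc["duration"])
    return values


def combine_phase(values):
    """combine_script: collapse each caller's list into a per-shard sum."""
    return {caller: sum(durations) for caller, durations in values.items()}


def reduce_phase(shard_results):
    """reduce_script: merge per-shard sums, then return [max, min, avg]."""
    totals = defaultdict(int)
    for shard in shard_results:
        for caller, total in shard.items():
            totals[caller] += total
    sums = list(totals.values())
    return [max(sums), min(sums), sum(sums) / len(sums)]


# Hypothetical documents spread over two hypothetical shards.
shard_a = [{"callerId": 1, "duration": 100}, {"callerId": 2, "duration": 40}]
shard_b = [
    {"callerId": 1, "duration": 50},
    {"callerId": 2, "duration": 60},
    {"callerId": 3, "duration": 50},
]

result = reduce_phase([combine_phase(map_phase(s)) for s in (shard_a, shard_b)])
print(result)
```

The point of the split is that map and combine run per shard, so only a small map of per-caller sums has to travel to the node that runs the reduce step.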
For the query above and for the exact data you provided the output is this:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 6,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "executionTimes": {
      "value": [ 1202, 1033, "1117.5" ]
    }
  }
}
The order of the values in the value array is [max, min, avg], as determined by the script in reduce_script.
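On the client side, reading those three values back could look roughly like this in Python. This is a sketch that only uses the standard json module and a trimmed copy of the response shown above; the variable names are made up, and the float() call is there because the average came back as a string in that output.

```python
import json

# Trimmed copy of the response shown above.
response_text = """
{
  "aggregations": {
    "executionTimes": { "value": [ 1202, 1033, "1117.5" ] }
  }
}
"""

response = json.loads(response_text)

# Order is [max, min, avg], as produced by reduce_script.
max_total, min_total, avg_total = response["aggregations"]["executionTimes"]["value"]
avg_total = float(avg_total)  # the average is returned as a string here

print(max_total, min_total, avg_total)
```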
There will be a new feature in the upcoming version 2.0.0 called "Reducers". Reducers will allow you to calculate aggregations over the results of other aggregations.
Related post: https://github.com/elasticsearch/elasticsearch/issues/8110