Elasticsearch - combine fields from multiple documents
You can use a Scripted Metric Aggregation with a reduce_script.
Set up some test data:
curl -XPUT http://localhost:9200/testing/foo/1 -d '{ "foo" : [1, 2, 3] }'
curl -XPUT http://localhost:9200/testing/foo/2 -d '{ "foo" : [4, 5, 6] }'
Now try this aggregation:
curl -XGET "http://localhost:9200/testing/foo/_search" -d '{
  "size": 0,
  "aggs": {
    "fooreduced": {
      "scripted_metric": {
        "init_script": "_agg[\"result\"] = []",
        "map_script": "_agg.result.add(doc[\"foo\"].values)",
        "reduce_script": "reduced = []; for (a in _aggs) { for (entry in a) { reduced += entry.value } }; return reduced.flatten().sort()"
      }
    }
  }
}'
The call will return this:
{
  "took": 50,
  "timed_out": false,
  "_shards": {
    "total": 6,
    "successful": 6,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "fooreduced": {
      "value": [ 1, 2, 3, 4, 5, 6 ]
    }
  }
}
There might be a solution without .flatten(), but I'm not familiar enough with Groovy (yet) to find one. I also can't say how well this aggregation performs; you'll have to test that yourself.
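To make the three script phases easier to follow, here is a client-side Python sketch of what they do (the per-shard states are hypothetical, assuming each test document lands on its own shard; the reduce step mirrors the Groovy script's flatten-and-sort):

```python
# init_script: each shard starts with an empty "result" list in its state.
# map_script: for every document on the shard, the list of "foo" values is
# appended, so each shard state holds a list of lists.
shard1 = {"result": [[1, 2, 3]]}  # hypothetical shard holding doc 1
shard2 = {"result": [[4, 5, 6]]}  # hypothetical shard holding doc 2

# reduce_script: iterate over all shard states (_aggs), concatenate every
# entry's value, then flatten the nested lists and sort the combined values.
aggs = [shard1, shard2]
reduced = []
for a in aggs:
    for value in a.values():
        reduced += value

flat = sorted(v for sub in reduced for v in sub)
print(flat)  # [1, 2, 3, 4, 5, 6]
```

The same flatten-and-sort could of course be done client-side like this after fetching the documents; the scripted metric just pushes that work into Elasticsearch.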