Simulating a join in ElasticSearch
You can do it like this:
- first you have a query that filters your documents with actions
1
and2
only (I have no idea if you can have other action types) - then the magic is with aggregations
- the first aggregation is a
terms
one foruser_id
, so that you can do individual calculations per user - then you use a
cardinality
sub-aggregation to count the number of distinct actions per user. Since the query is for actions1
and2
that number can only be 1 or 2 - then you use a
bucket_selector
sub-aggregation to only keep those users that have the cardinality result of2
.
- the first aggregation is a
{ "size": 0, "query": { "bool": { "should": [ { "terms": { "action_id": [ 1, 2 ] } } ] } }, "aggs": { "users": { "terms": { "field": "user_id", "size": 10 }, "aggs": { "actions": { "cardinality": { "field": "action_id" } }, "actions_count_bucket_filter": { "bucket_selector": { "buckets_path": { "totalActions": "actions" }, "script": "totalActions >= 2" } } } } }}
The result will look like this:
"aggregations": { "users": { "doc_count_error_upper_bound": 0, "sum_other_doc_count": 0, "buckets": [ { "key": 1, "doc_count": 2, "actions": { "value": 2 } }, { "key": 5, "doc_count": 2, "actions": { "value": 2 } } ] } }
The key
s are the user_ids whose actions are 1
and 2
. bucket_selector
aggregation is available in 2.x+ version of ES.