ElasticSearch Join Filter: Using subquery results as filter input possible?
Here's a link to a runnable example:
http://sense.qbox.io/gist/9da6a30fc12c36f90ae39111a08df283b56ec03c
It presumes documents that look like:
{ "transaction_type" : "some_transaction", "user_base" : "some_user_base_id" }
The query sets "size" to 0 so that no hits are returned, since the aggregations take care of computing the stats you're looking for:
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "distinct_transactions": {
      "terms": { "field": "transaction_type", "size": 20 },
      "aggs": {
        "by_user_base": {
          "terms": { "field": "user_base", "size": 20 }
        }
      }
    }
  }
}
And here's what the result looks like:
"aggregations": {
  "distinct_transactions": {
    "buckets": [
      {
        "key": "subscribe",
        "doc_count": 4,
        "by_user_base": {
          "buckets": [
            { "key": "2", "doc_count": 3 },
            { "key": "1", "doc_count": 1 }
          ]
        }
      },
      {
        "key": "purchase",
        "doc_count": 3,
        "by_user_base": {
          "buckets": [
            { "key": "1", "doc_count": 2 },
            { "key": "2", "doc_count": 1 }
          ]
        }
      }
    ]
  }
}
So, inside of "aggregations", "distinct_transactions" holds a list of buckets, one per transaction type. Each bucket's key is the transaction type, and its doc_count is the total number of transactions of that type by all users.
Inside each "distinct_transactions" bucket, there's "by_user_base", which is another terms agg (nested). Just like the transactions, the key represents the user base name (or ID or whatever) and the doc_count represents that user base's number of transactions of that type.
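If it helps, here's a rough sketch of how you might walk that nested response on the client side. It's plain Python over the response shown above, pasted in as a dict; the function name flatten_buckets is just something I made up for illustration, and nothing here calls Elasticsearch itself.

```python
# Response body copied from the example result above.
response = {
    "aggregations": {
        "distinct_transactions": {
            "buckets": [
                {
                    "key": "subscribe",
                    "doc_count": 4,
                    "by_user_base": {
                        "buckets": [
                            {"key": "2", "doc_count": 3},
                            {"key": "1", "doc_count": 1},
                        ]
                    },
                },
                {
                    "key": "purchase",
                    "doc_count": 3,
                    "by_user_base": {
                        "buckets": [
                            {"key": "1", "doc_count": 2},
                            {"key": "2", "doc_count": 1},
                        ]
                    },
                },
            ]
        }
    }
}


def flatten_buckets(aggregations):
    """Hypothetical helper: turn the nested terms buckets into flat
    (transaction_type, user_base, count) rows."""
    rows = []
    for tx_bucket in aggregations["distinct_transactions"]["buckets"]:
        for ub_bucket in tx_bucket["by_user_base"]["buckets"]:
            rows.append((tx_bucket["key"], ub_bucket["key"], ub_bucket["doc_count"]))
    return rows


rows = flatten_buckets(response["aggregations"])
for transaction_type, user_base, count in rows:
    print(transaction_type, user_base, count)
```

Each row pairs an outer bucket key with an inner bucket key, so the inner doc_count values for one transaction type add up to that type's outer doc_count (as long as no user base is cut off by the inner "size").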
Is that what you were looking to do? Hope I helped.
With the current version of ElasticSearch, there's the new significant_terms aggregation type, which can be used to calculate the affinity scores for my use case in a simpler way.
All the metrics relevant to me can then be calculated in one step, which is very nice!
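For anyone landing here, a minimal sketch of what such a request body could look like, built as a Python dict. This is an assumption about the setup, not my exact query: it reuses the field names from the answer above ("user_base", "transaction_type"), picks one user base as the foreground set via a term query, and the helper name and the aggregation name "significant_transactions" are made up for illustration.

```python
import json


def build_significant_terms_query(user_base_id, field="transaction_type", size=20):
    """Hypothetical helper: request body asking which values of `field`
    are unusually frequent for one user base compared to the whole index."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        # Foreground set: documents of the chosen user base (assumed field name).
        "query": {"term": {"user_base": user_base_id}},
        "aggs": {
            "significant_transactions": {
                "significant_terms": {"field": field, "size": size}
            }
        },
    }


query = build_significant_terms_query("1")
print(json.dumps(query, indent=2))
```

The printed JSON can be sent to the _search endpoint of the index. Each returned bucket should carry a score next to its doc_count, which is what I would read as the affinity value.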