How to return the count of unique documents using an Elasticsearch aggregation
I think you need a reverse_nested aggregation, because you want to aggregate on a nested value but count the ROOT documents, not the nested ones.
{
  "query": {
    "bool": {
      "must": [
        { "term": { "last_name": "smith" } }
      ]
    }
  },
  "aggs": {
    "location": {
      "nested": { "path": "location" },
      "aggs": {
        "state": {
          "terms": { "field": "location.state", "size": 10 },
          "aggs": {
            "top_reverse_nested": { "reverse_nested": {} }
          }
        }
      }
    }
  }
}
And, as a result, you would see something like this:
"aggregations": {
  "location": {
    "doc_count": 6,
    "state": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "ny",
          "doc_count": 4,
          "top_reverse_nested": { "doc_count": 2 }
        },
        {
          "key": "ca",
          "doc_count": 2,
          "top_reverse_nested": { "doc_count": 2 }
        }
      ]
    }
  }
}
What you are looking for is the doc_count under the top_reverse_nested part of each bucket.

One point here: if I'm not mistaken, "doc_count": 6 is the NESTED document count, so don't read these numbers as root document counts. The count is on the nested documents, so a root document with three matching nested documents counts as 3, not 1.
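To make the difference between the two counts concrete, here is a small Python sketch that reads the sample response above and pairs each state's nested count with its root document count. The response dict is just the example output copied by hand, and root_counts_by_state is a hypothetical helper name, not part of any Elasticsearch client.

```python
# Sample "aggregations" section, copied from the example response above.
response = {
    "aggregations": {
        "location": {
            "doc_count": 6,  # nested documents, not root documents
            "state": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                    {"key": "ny", "doc_count": 4, "top_reverse_nested": {"doc_count": 2}},
                    {"key": "ca", "doc_count": 2, "top_reverse_nested": {"doc_count": 2}},
                ],
            },
        }
    }
}


def root_counts_by_state(resp):
    """Map each state key to (nested_count, root_count).

    Hypothetical helper; it assumes the aggregation names used in the
    query above ("location", "state", "top_reverse_nested").
    """
    buckets = resp["aggregations"]["location"]["state"]["buckets"]
    return {
        b["key"]: (b["doc_count"], b["top_reverse_nested"]["doc_count"])
        for b in buckets
    }


counts = root_counts_by_state(response)
for state, (nested, root) in counts.items():
    print(f"{state}: {nested} nested matches in {root} root documents")
```

With the sample data, "ny" has 4 nested matches spread over 2 root documents, which is exactly the gap between the bucket's doc_count and its top_reverse_nested doc_count.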