ES: Bucket agg + top_hits + scroll? How to return all hits (more than `size+from` max) in buckets?


I don't believe your use case is supported. Aggregations deliberately "throw out" the rest of the information in the documents they summarize. The `top_hits` aggregation is only meant to return the most relevant hits in each bucket that match your query. It is more of a scoring feature than a document retrieval feature, i.e. it isn't meant to return all the documents in a bucket.

If you need all the documents anyway, why don't you aggregate the results yourself? This is your option #2 and it seems like the best option to me.

The SO post you referenced describes a workaround for paging through an aggregation by using the `exclude` value filter in terms aggregations. It doesn't use the scroll API, and I don't think it helps you either.

Lastly, Elasticsearch terms aggregations can return approximate counts, because each shard only reports its own top terms (controlled by `shard_size`) and the per-shard results are merged afterwards. If you need the documents anyway, you can get completely accurate aggregations by performing the bucketing in your application. You'll have to visit every document, which might be slower than what ES can do, but in exchange you get exact results rather than approximate ones.
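To make the "bucket it yourself" idea concrete, here is a minimal Python sketch. It assumes you already have an iterable of raw hits (for example, streamed from a scroll; with the official Python client that could be `elasticsearch.helpers.scan`). The function name `bucket_hits`, the field `color`, and the sample documents are all made up for illustration and are not from your question.

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List


def bucket_hits(hits: Iterable[Dict[str, Any]], field: str) -> Dict[Any, List[Dict[str, Any]]]:
    """Group raw search hits by the value of `field` in each hit's `_source`."""
    buckets = defaultdict(list)
    for hit in hits:
        source = hit["_source"]
        # Documents missing the field land in a None bucket instead of being dropped.
        buckets[source.get(field)].append(source)
    return dict(buckets)


# Hypothetical stand-in for hits streamed from a scroll over the whole index.
sample_hits = [
    {"_id": "1", "_source": {"color": "red", "price": 10}},
    {"_id": "2", "_source": {"color": "blue", "price": 20}},
    {"_id": "3", "_source": {"color": "red", "price": 30}},
]

buckets = bucket_hits(sample_hits, "color")

# Exact bucket counts, computed from the same documents you keep.
counts = {key: len(docs) for key, docs in buckets.items()}
print(counts)
```

Because every document passes through the loop exactly once, the counts are exact and each bucket holds all of its documents, not just a top few. The cost is that the whole result set is held in application memory, so for very large indices you may want to write each bucket out incrementally instead.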

If you can share more details on your use case, perhaps one of us can give better advice. For example, why do you need all the documents as well as the bucket counts?