Diversified results on Elasticsearch search
You can couple the sampler with the top_hits
aggregation to get diversified results.
{
  "query": { "match": { "query": "iphone" } },
  "size": 0,
  "aggs": {
    "sample": {
      "sampler": { "shard_size": 200, "field": "user.id" },
      "aggs": {
        "diversifiedMatches": {
          "top_hits": { "size": 10 }
        }
      }
    }
  }
}
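As a rough sketch of how a client might use this, the Python below builds the same request body and pulls the diversified hits out of a response. The helper names (build_diversified_query, extract_diversified_hits) are made up for illustration, and the response path assumes the usual layout where a sub-aggregation is nested under its parent's name. No HTTP call is made here; sending the body to a cluster is left to whichever client you use.

```python
import json


def build_diversified_query(text, field="user.id", shard_size=200, size=10):
    """Build the request body shown above (hypothetical helper)."""
    return {
        "query": {"match": {"query": text}},
        "size": 0,  # suppress the normal hits; only the aggregation is wanted
        "aggs": {
            "sample": {
                # "field" is the diversification field, as in the request above
                "sampler": {"shard_size": shard_size, "field": field},
                "aggs": {
                    "diversifiedMatches": {"top_hits": {"size": size}}
                },
            }
        },
    }


def extract_diversified_hits(response):
    """Return the top_hits documents nested under the sampler aggregation.

    Assumes the response nests sub-aggregations under the parent's name.
    """
    return response["aggregations"]["sample"]["diversifiedMatches"]["hits"]["hits"]


body = build_diversified_query("iphone")
print(json.dumps(body, indent=2))
```

Setting "size" to 0 at the top level means the diversified documents are only available through the aggregation, so the client has to read them from the nested top_hits bucket rather than from the regular hits array.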
There are some caveats, e.g.:
1) Deduplication is per-shard, not global
2) The diversification field must be single-valued
3) No support for pagination
4) No support for sorting on anything other than score
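To make the first caveat concrete, here is a small simulation in plain Python. It is not Elasticsearch code: the Doc tuple, shard_sample and merged_top_hits are invented for illustration, and the model assumes each shard keeps at most one document per user.id before the per-shard samples are merged by score.

```python
from collections import namedtuple

# Toy stand-in for a scored search hit (illustrative only).
Doc = namedtuple("Doc", ["doc_id", "user_id", "score"])


def shard_sample(docs, shard_size):
    """Keep the best-scoring doc per user_id on one shard, up to shard_size."""
    kept, seen_users = [], set()
    for doc in sorted(docs, key=lambda d: d.score, reverse=True):
        if doc.user_id in seen_users:
            continue  # deduplicated, but only within this shard
        seen_users.add(doc.user_id)
        kept.append(doc)
        if len(kept) == shard_size:
            break
    return kept


def merged_top_hits(shards, shard_size, size):
    """Merge the per-shard samples by score with no further deduplication."""
    merged = [doc for shard in shards for doc in shard_sample(shard, shard_size)]
    return sorted(merged, key=lambda d: d.score, reverse=True)[:size]


shards = [
    [Doc("a1", "alice", 9.0), Doc("a2", "alice", 8.5), Doc("b1", "bob", 7.0)],
    [Doc("a3", "alice", 8.0), Doc("c1", "carol", 6.0)],
]
hits = merged_top_hits(shards, shard_size=200, size=10)
print([(d.doc_id, d.user_id) for d in hits])
```

In this toy data "alice" has documents on both shards, so she still appears twice in the merged result even though each shard removed its own duplicates. Routing documents by the diversification field would keep each user on one shard, at the cost of constraining how the index is laid out.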
Addressing these issues would be hard. It would require expensive, complex internal coordination, plus more guidance from the client about when and where "duplicate" results can be reintroduced (page 2? page 3? how many?).