Diversified results on Elasticsearch search


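You can combine the sampler aggregation with a top_hits sub-aggregation to get diversified results. Note that from Elasticsearch 5.0 onward the "field" option used in the request below belongs to the diversified_sampler aggregation rather than to sampler.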

{
    "query": {
        "match": {
            "query": "iphone"
        }
    },
    "size": 0,
    "aggs": {
        "sample": {
            "sampler": {
                "shard_size": 200,
                "field": "user.id"
            },
            "aggs": {
                "diversifiedMatches": {
                    "top_hits": {
                        "size": 10
                    }
                }
            }
        }
    }
}
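As a minimal sketch, the same request body can be built in Python so that the diversification settings are easy to vary. The function names build_diversified_query and extract_hits, and all parameter names, are hypothetical helpers for illustration, not part of any Elasticsearch client. The sketch assumes Elasticsearch 5.0 or later, where the aggregation is called diversified_sampler and accepts max_docs_per_value; pass sampler_type="sampler" to reproduce the request above on older versions.

```python
import json
from typing import Any, Dict, List


def build_diversified_query(
    text: str,
    query_field: str = "query",        # field searched by the match query, as in the request above
    diversify_field: str = "user.id",  # single-valued field used for de-duplication
    shard_size: int = 200,             # number of top-scoring docs sampled on each shard
    max_docs_per_value: int = 1,       # docs allowed per distinct value of diversify_field
    size: int = 10,                    # number of hits returned by top_hits
    sampler_type: str = "diversified_sampler",  # assumption: ES 5.0+ aggregation name
) -> Dict[str, Any]:
    """Build a search body that samples diverse docs and returns the best of them."""
    return {
        "query": {"match": {query_field: text}},
        "size": 0,  # suppress the normal hit list; results come from the aggregation
        "aggs": {
            "sample": {
                sampler_type: {
                    "shard_size": shard_size,
                    "field": diversify_field,
                    "max_docs_per_value": max_docs_per_value,
                },
                "aggs": {
                    "diversifiedMatches": {"top_hits": {"size": size}}
                },
            }
        },
    }


def extract_hits(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Pull the diversified hits out of a search response dictionary."""
    return response["aggregations"]["sample"]["diversifiedMatches"]["hits"]["hits"]


if __name__ == "__main__":
    print(json.dumps(build_diversified_query("iphone"), indent=4))
```

The resulting dictionary can be sent as the JSON body of a _search request with whatever HTTP or Elasticsearch client you already use.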

There are some caveats, e.g.:

1) Deduplication is per shard, not global

2) The diversification field must be single-valued

3) There is no support for pagination

4) There is no support for sorting on anything other than score

Addressing the above issues would be hard: it would require expensive, complex coordination internally, plus more guidance from the client about when and where "duplicate" results can be reintroduced (page 2? page 3? how many?).
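The first caveat is easy to miss, so here is a simplified simulation of it in Python. It is only a model of the behaviour described above, not Elasticsearch's actual implementation: each shard keeps its best-scoring document per user, the shard results are merged by score, and the same user can still appear once per shard. The hit tuples, function names, and sample data are all invented for illustration. A client-side pass over the merged hits is one possible way to remove the remaining duplicates, at the cost of returning fewer than the requested number of results.

```python
from typing import Dict, List, Tuple

Hit = Tuple[float, str, str]  # (score, user_id, doc_id), a made-up shape for this sketch


def dedupe(hits: List[Hit], max_docs_per_value: int = 1) -> List[Hit]:
    """Keep at most max_docs_per_value best-scoring hits per user."""
    kept: List[Hit] = []
    seen: Dict[str, int] = {}
    for hit in sorted(hits, key=lambda h: h[0], reverse=True):
        user = hit[1]
        if seen.get(user, 0) < max_docs_per_value:
            kept.append(hit)
            seen[user] = seen.get(user, 0) + 1
    return kept


def merge_shards(shards: List[List[Hit]], size: int) -> List[Hit]:
    """De-duplicate on each shard independently, then merge by score."""
    merged = [hit for shard in shards for hit in dedupe(shard)]
    return sorted(merged, key=lambda h: h[0], reverse=True)[:size]


# The same user ("alice") has documents on two different shards.
shard_a = [(9.0, "alice", "a1"), (8.0, "alice", "a2"), (5.0, "bob", "b1")]
shard_b = [(7.0, "alice", "a3"), (6.0, "carol", "c1")]

merged = merge_shards([shard_a, shard_b], size=10)
# "alice" survives once per shard, so she appears twice in the merged list.
globally_unique = dedupe(merged)  # client-side pass over the merged hits

if __name__ == "__main__":
    print([doc_id for _, _, doc_id in merged])
    print([doc_id for _, _, doc_id in globally_unique])
```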