
Elasticsearch: return deduplicated results, with a custom sort bubbling to the top


Try this approach:

  • your mapping should also store a not_analyzed version of your title, so that the buckets are built from the full title rather than from the individual terms that make up the title:
    {
      "mappings": {
        "engineers": {
          "properties": {
            "title": {
              "type": "string",
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed"
                }
              }
            },
            "content": {
              "type": "string"
            },
            "weighted_importance": {
              "type": "integer"
            }
          }
        }
      }
    }
  • group the results on buckets built on title.raw defined above
  • define a top_hits sub-aggregation to bring back the "best" document for each bucket
  • define a second sub-aggregation at the same level as the top_hits one: a max aggregation that computes the maximum weighted_importance in each bucket
  • in the main aggregation, use that max to sort the resulting buckets
GET /my_index/engineers/_search?search_type=count
{
  "query": {
    "match": {
      "title": "Engineer"
    }
  },
  "aggs": {
    "title": {
      "terms": {
        "field": "title.raw",
        "order": {"best_hit": "desc"}
      },
      "aggs": {
        "first_match": {
          "top_hits": {
            "sort": [{"weighted_importance": {"order": "desc"}}],
            "size": 1
          }
        },
        "best_hit": {
          "max": {
            "lang": "groovy",
            "script": "doc['weighted_importance'].value"
          }
        }
      }
    }
  }
}
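To make it easier to see what the aggregation above should return, here is a minimal pure-Python sketch of the same logic. It does not talk to Elasticsearch. The function name `dedupe_by_title` and the sample documents are made up for illustration, and the input list is assumed to contain only documents that already matched the query. The output keys mirror the aggregation names used in the request (`first_match`, `best_hit`).

```python
from collections import defaultdict


def dedupe_by_title(docs, limit=10):
    """Hypothetical helper: mimic terms + top_hits + max on already-matched docs."""
    # Equivalent of the terms aggregation on title.raw:
    # one bucket per exact (not analyzed) title.
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc["title"]].append(doc)

    results = []
    for title, members in buckets.items():
        # Equivalent of top_hits with size 1, sorted by weighted_importance desc.
        best = max(members, key=lambda d: d["weighted_importance"])
        results.append({
            "key": title,
            "doc_count": len(members),
            # Equivalent of the max sub-aggregation named best_hit.
            "best_hit": best["weighted_importance"],
            "first_match": best,
        })

    # Equivalent of "order": {"best_hit": "desc"} on the terms aggregation.
    results.sort(key=lambda b: b["best_hit"], reverse=True)
    # A terms aggregation returns 10 buckets by default; limit plays that role here.
    return results[:limit]


# Illustrative sample data only, not taken from the question.
docs = [
    {"title": "Software Engineer", "content": "a", "weighted_importance": 3},
    {"title": "Software Engineer", "content": "b", "weighted_importance": 7},
    {"title": "Data Engineer", "content": "c", "weighted_importance": 5},
    {"title": "Site Reliability Engineer", "content": "d", "weighted_importance": 9},
]

buckets = dedupe_by_title(docs)
for bucket in buckets:
    print(bucket["key"], bucket["best_hit"], bucket["first_match"]["content"])
```

Each title appears once, represented by its highest-weighted document, and the titles are ordered by that weight. If more than 10 distinct titles are expected, the real query would also need a larger "size" on the terms aggregation.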