ElasticSearch multi level parent-child aggregation

You don't need aggregations to do this:

These are the sort criteria:

1. Distance ASC (company.location)
2. Rating DESC (company.rating_value)
3. Soonest Future Availability ASC (company.employee.availability.start)

If you ignore #3, then you can run a relatively simple company query like this:

    GET /companies/company/_search
    {
      "query": { "match_all": {} },
      "sort": {
        "_script": {
          "params": {
            "lat": 51.5186,
            "lon": -0.1347
          },
          "lang": "groovy",
          "type": "number",
          "order": "asc",
          "script": "doc['location'].distanceInMiles(lat,lon)"
        },
        "rating_value": { "order": "desc" }
      }
    }
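To make the ordering concrete, here is a minimal Python sketch of what those two sort clauses ask for, computed client-side on a few made-up companies. The haversine formula stands in for distanceInMiles (an assumption: Elasticsearch's own distance calculation may differ slightly), and the sample names, coordinates, and ratings are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an approximation


def distance_in_miles(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sort_companies(companies, lat, lon):
    """Order by distance ascending, then rating_value descending."""
    return sorted(
        companies,
        key=lambda c: (distance_in_miles(lat, lon, c["lat"], c["lon"]), -c["rating_value"]),
    )


# Hypothetical sample data: two companies share a location, one is farther away.
companies = [
    {"name": "far", "lat": 51.7520, "lon": -1.2577, "rating_value": 5},
    {"name": "near_low", "lat": 51.5200, "lon": -0.1300, "rating_value": 3},
    {"name": "near_high", "lat": 51.5200, "lon": -0.1300, "rating_value": 4},
]

ordered = [c["name"] for c in sort_companies(companies, 51.5186, -0.1347)]
print(ordered)
```

Note that the second key only breaks ties: rating_value decides the order only between companies at exactly the same distance.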

#3 is tricky because, for each company, you need to reach down (company > employee > availability), find the availability closest to the time of the request, and use that time difference as the third sort criterion.

We're going to use a function_score query at the grandchild level to store the time difference between the request time and each availability in the hit's _score. (Then we'll use the _score as the third sort criterion.)

To reach the grandchildren we need to use a has_child query inside a has_child query.
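As a rough illustration of that nesting, the sketch below builds the two-level has_child structure as a plain Python dict. The helper name nest_has_child is hypothetical, and match_all is only a placeholder for whatever query runs at the grandchild level.

```python
import json


def nest_has_child(child_types, inner_query, score_mode="max"):
    """Wrap inner_query in one has_child clause per type, outermost type first."""
    query = inner_query
    # Build from the innermost level outward, so iterate the types in reverse.
    for child_type in reversed(child_types):
        query = {
            "has_child": {
                "type": child_type,
                "score_mode": score_mode,
                "query": query,
            }
        }
    return query


# match_all is only a placeholder for the grandchild-level query.
nested = nest_has_child(["employee", "availability"], {"match_all": {}})
print(json.dumps(nested, indent=2))
```

The outer clause targets employee and the inner one targets availability, so the innermost query is evaluated against the grandchild documents.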

For each company we want the soonest available Employee (and of course their closest Availability). Elasticsearch 2.0 will give us a "score_mode": "min" for cases like this, but for now, since we're limited to "score_mode": "max", we'll make the grandchild _score the reciprocal of the time difference.

    "function_score": {
      "filter": {
        "range": {
          "start": {
            "gt": "2014-12-22T10:34:18+01:00"
          }
        }
      },
      "functions": [
        {
          "script_score": {
            "lang": "groovy",
            "params": {
              "requested": "2014-12-22T10:34:18+01:00",
              "millisPerHour": 3600000
            },
            "script": "1 / ((doc['availability.start'].value - new DateTime(requested).getMillis()) / millisPerHour)"
          }
        }
      ]
    }

So now the _score for each grandchild (Availability) will be 1 / number-of-hours-until-available. That way the maximum score per Employee corresponds to that Employee's soonest Availability, and the maximum score per Company corresponds to its soonest-available Employee.
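The arithmetic behind the reciprocal trick can be checked with a small Python sketch. The employee names and hour values are made up for illustration; the point is only that taking the maximum of the reciprocals at both levels lands on the smallest delay.

```python
# Hypothetical hours-until-available for each employee of one company.
hours_by_employee = {
    "alice": [30.0, 6.0, 48.0],
    "bob": [2.0, 12.0],
}


def reciprocal(hours):
    """Score used at the grandchild level: larger means sooner."""
    return 1.0 / hours


# "score_mode": "max" at the employee level, then again at the company level.
employee_scores = {
    name: max(reciprocal(h) for h in hours)
    for name, hours in hours_by_employee.items()
}
company_score = max(employee_scores.values())

# The soonest availability across the whole company, computed directly.
soonest_hours = min(h for hours in hours_by_employee.values() for h in hours)
print(employee_scores, company_score, soonest_hours)
```

Because 1/x is strictly decreasing for positive x, the largest reciprocal always belongs to the smallest positive delay. The range filter on start keeps every delay positive, so a higher company _score means a sooner availability.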

Putting it all together, we continue to query company but use company > employee > availability to generate the _score used as the #3 sort criterion. Because the _score is a reciprocal, a larger _score means a sooner availability, so _score is sorted in descending order:

    GET /companies/company/_search
    {
      "query": {
        "has_child": {
          "type": "employee",
          "score_mode": "max",
          "query": {
            "has_child": {
              "type": "availability",
              "score_mode": "max",
              "query": {
                "function_score": {
                  "filter": {
                    "range": {
                      "start": {
                        "gt": "2014-12-22T10:34:18+01:00"
                      }
                    }
                  },
                  "functions": [
                    {
                      "script_score": {
                        "lang": "groovy",
                        "params": {
                          "requested": "2014-12-22T10:34:18+01:00",
                          "millisPerHour": 3600000
                        },
                        "script": "1/((doc['availability.start'].value - new DateTime(requested).getMillis()) / millisPerHour)"
                      }
                    }
                  ]
                }
              }
            }
          }
        }
      },
      "sort": {
        "_script": {
          "params": {
            "lat": 51.5186,
            "lon": -0.1347
          },
          "lang": "groovy",
          "type": "number",
          "order": "asc",
          "script": "doc['location'].distanceInMiles(lat,lon)"
        },
        "rating_value": { "order": "desc" },
        "_score": { "order": "desc" }
      }
    }


You should check out the R-tree data structure: https://en.wikipedia.org/wiki/R-tree

CodeHunter
