Is it possible to sort by a range in Elasticsearch?
Sorting on a field of the range data type is not straightforward. Still, you can use script-based sorting to get the expected result, to some extent.
For example, to keep the script simple, I'm assuming that in all your docs the my_range field holds values for gt and lte only, and that you want to sort on the smaller of the two. You can then add the following sort:
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "my_value": "hi" } },
        { "range": { "my_range": { "gt": 0, "lte": 200 } } }
      ]
    }
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "inline": "Math.min(params['_source']['my_range']['gt'], params['_source']['my_range']['lte'])"
      },
      "order": "asc"
    }
  }
}
You can modify the script as needed for more complex data involving any combination of lt, gt, lte, and gte.
Updates (scripts for other use cases):
1. Sort by difference
"Math.abs(params['_source']['my_range']['gt'] - params['_source']['my_range']['lte'])"
2. Sort by gt
"params['_source']['my_range']['gt']"
3. Sort by lte
"params['_source']['my_range']['lte']"
4. Sorting when the query returns some docs that don't have the range field
"if(params['_source']['my_range'] != null) { <sorting logic> } else { return 0; }"
Replace <sorting logic> with the required sorting logic (one of the three above, or the one in the query). return 0 can be replaced by return -1 or any other number, depending on your sorting needs.
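If you build the request from client code, the guard in item 4 can be wrapped around any of the expressions above. Here is a minimal Python sketch of that idea. The helper name guarded_sort and its parameters are my own, not part of any Elasticsearch client, and it assumes the sorting logic is a single expression. It keeps the "inline" key used in this answer; I believe newer Elasticsearch versions call that key "source".

```python
import json


def guarded_sort(sort_logic, fallback=0, order="asc"):
    """Build a _script sort clause that falls back to a constant
    when a document has no my_range field.

    sort_logic is assumed to be a single Painless expression,
    e.g. one of the three scripts listed above.
    """
    source = (
        "if (params['_source']['my_range'] != null) "
        "{ return " + sort_logic + "; } "
        "else { return " + str(fallback) + "; }"
    )
    return {
        "_script": {
            "type": "number",
            "script": {"lang": "painless", "inline": source},
            "order": order,
        }
    }


# Example: sort by lte, sending docs without my_range to the front (-1).
clause = guarded_sort("params['_source']['my_range']['lte']", fallback=-1)
print(json.dumps({"sort": clause}, indent=2))
```

The printed JSON can be pasted under the "query" part of the search body shown earlier.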
I think what you are looking for is a sort based on the difference of the range, because I'm not sure that simply sorting on either of the range values would make sense.
For example, if the range for one document is 100, 300 and for another 200, 600, you would want the narrower range, i.e. 300 - 100 = 200, to appear at the top.
If so, I've used the Painless script below to implement script-based sorting.
Sorting based on difference in Range
POST <your_index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "inline": "params._source.my_range.lte-params._source.my_range.gte"
      },
      "order": "asc"
    }
  }
}
Note that in this case the sort is not based on any of the field values of my_range, only on their difference. If you want to sort further on fields like lte, lt, gte, or gt, you can implement the sort with multiple scripts, as below:
Sorting based on difference in Range + Range Field (my_range.lte)
POST <your_index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "inline": "params._source.my_range.lte - params._source.my_range.gte"
        },
        "order": "asc"
      }
    },
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "inline": "params._source.my_range.lte"
        },
        "order": "asc"
      }
    }
  ]
}
So in this case, even if two documents have the same range difference, the one with the smaller my_range.lte shows up first.
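To see the ordering this two-level sort should produce, here is a small Python sketch that mimics the two scripts on made-up documents. The docs and their ids are hypothetical, and the code only models the sort keys; it does not talk to Elasticsearch.

```python
# Hypothetical sample documents; only the my_range field matters here.
docs = [
    {"id": "a", "my_range": {"gte": 200, "lte": 600}},
    {"id": "b", "my_range": {"gte": 100, "lte": 300}},
    {"id": "c", "my_range": {"gte": 50, "lte": 250}},
]


def sort_key(doc):
    r = doc["my_range"]
    width = r["lte"] - r["gte"]  # mirrors the first _script sort
    return (width, r["lte"])     # the second _script sort breaks ties


ordered = [d["id"] for d in sorted(docs, key=sort_key)]
print(ordered)  # ['c', 'b', 'a']
```

Docs b and c both have a difference of 200, so c comes first because its lte (250) is smaller; a has a difference of 400 and comes last.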
Sort based on range field
However, if you simply want to sort on one of the range values, you can use the query below.
POST <your_index_name>/_search
{
  "query": {
    "match_all": {}
  },
  "sort": {
    "_script": {
      "type": "number",
      "script": {
        "lang": "painless",
        "inline": "params._source.my_range.lte"
      },
      "order": "asc"
    }
  }
}
Updated Answer to manage documents without range
This is for the scenario: sort based on difference in range + Range.lte or Range.lt, whichever is present.
What the code below does:
- Checks whether the document has the my_range field. If it doesn't, it returns Long.MAX_VALUE by default, which means that if you sort asc, this document is returned last.
- Checks whether the document has lte or lt and uses that value as high. Note that the default value of high is Long.MAX_VALUE.
- Similarly, checks whether the document has gte or gt and uses that value as low. The default value of low is 0.
- Calculates high - low, the value on which sorting is applied.
Updated Query
POST <your_index_name>/_search
{
  "size": 100,
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "inline": """
            // Docs without my_range sort last in asc order.
            if (params._source.my_range == null) {
              return Long.MAX_VALUE;
            } else {
              long high = Long.MAX_VALUE;
              long low = 0L;
              if (params._source.my_range.lte != null) {
                high = params._source.my_range.lte;
              } else if (params._source.my_range.lt != null) {
                high = params._source.my_range.lt;
              }
              if (params._source.my_range.gte != null) {
                low = params._source.my_range.gte;
              } else if (params._source.my_range.gt != null) {
                low = params._source.my_range.gt;
              }
              return high - low;
            }
          """
        },
        "order": "asc"
      }
    },
    {
      "_script": {
        "type": "number",
        "script": {
          "lang": "painless",
          "inline": """
            // Tie-breaker: the upper bound (lte, else lt).
            if (params._source.my_range == null) {
              return Long.MAX_VALUE;
            }
            long high = Long.MAX_VALUE;
            if (params._source.my_range.lte != null) {
              high = params._source.my_range.lte;
            } else if (params._source.my_range.lt != null) {
              high = params._source.my_range.lt;
            }
            return high;
          """
        },
        "order": "asc"
      }
    }
  ]
}
This should work with ES 5.4. Hope it helps!
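If you want to sanity-check the defaults before running the query, the steps listed above can be mirrored in plain Python. This is only a sketch of the logic, not the Painless script itself: width_key and upper_key are names I made up, LONG_MAX stands in for Java's Long.MAX_VALUE, and the sample documents are hypothetical.

```python
LONG_MAX = 2**63 - 1  # stands in for Java's Long.MAX_VALUE


def upper_key(source):
    """Upper bound of my_range: lte if present, else lt, else LONG_MAX."""
    r = source.get("my_range")
    if r is None:
        return LONG_MAX
    if r.get("lte") is not None:
        return r["lte"]
    if r.get("lt") is not None:
        return r["lt"]
    return LONG_MAX


def width_key(source):
    """high - low, with the defaults described in the steps above."""
    r = source.get("my_range")
    if r is None:
        return LONG_MAX  # docs without my_range sort last in asc order
    low = 0
    if r.get("gte") is not None:
        low = r["gte"]
    elif r.get("gt") is not None:
        low = r["gt"]
    return upper_key(source) - low


# Hypothetical documents, keyed by a made-up id.
docs = {
    "no_range": {},
    "closed": {"my_range": {"gte": 100, "lte": 300}},
    "open_top": {"my_range": {"gt": 10}},
    "strict": {"my_range": {"gt": 200, "lt": 350}},
}
order = sorted(docs, key=lambda k: (width_key(docs[k]), upper_key(docs[k])))
print(order)
```

Ranges with a missing upper bound end up near the bottom because their high defaults to the maximum long value, and documents with no my_range at all come last.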
This can be resolved easily by using the regex interval filter:
Interval
The interval option enables the use of numeric ranges, enclosed by angle brackets "<>". For string: "foo80":
foo<1-100>     # match
foo<01-100>    # match
foo<001-100>   # no match
Enabled with the INTERVAL or ALL flags.
{
  "query": {
    "bool": {
      "filter": [
        { "match": { "my_value": "hi" } },
        { "regexp": { "my_range": { "value": "<0-200>" } } }
      ]
    }
  },
  "sort": {
    "my_range": {
      "order": "asc",
      "mode": "min"
    }
  }
}