Elastic Search: RemoteTransportException in Paginated search for more than 10000 results


This happens because the maximum result window of Elasticsearch (the index.max_result_window setting) is 10000 by default, and the limit applies to from + size. Even though you are requesting only a hundred results (only 10 exist in your case), i.e. results 10000 to 10010, under the hood Elasticsearch has to fetch all 10010 results, sort them, discard the first 10000, and return the 10 that are left, which is why the request exceeds the maximum window size. The simplest fix is to raise this limit from the default of 10000 to a much higher value. You could use the following command to do that:

curl -XPUT 'http://1.2.3.4:9200/index/_settings' -H 'Content-Type: application/json' -d '{ "index" : { "max_result_window" : 1000000 } }'
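
To make the failing condition concrete, here is a minimal sketch of the check that a from/size request has to pass. The function name is made up for illustration; the only facts taken from Elasticsearch are the default of 10000 and that the limit applies to from + size.

```python
DEFAULT_MAX_RESULT_WINDOW = 10_000  # Elasticsearch default for index.max_result_window


def exceeds_result_window(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size request would be rejected.

    The limit applies to from + size, not to size alone.
    (Hypothetical helper, not part of any Elasticsearch client.)
    """
    return from_ + size > max_result_window


# from=9990, size=10 is the deepest 10-hit page that still fits.
print(exceeds_result_window(from_=9_990, size=10))    # False
# One page further and the request is rejected.
print(exceeds_result_window(from_=10_000, size=10))   # True
```

Raising max_result_window only moves this threshold; the work done per request still grows with from + size.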

Coming to the scroll API: it does not return paginated results, so the concept of from does not exist and the size parameter is used differently. With the scan search type (removed in Elasticsearch 5.0), each shard is asked for its top "size" results, so with a size of 10 and 5 primary shards Elasticsearch returns up to 50 results per request; with a regular scroll, size is the number of hits per batch. Every request to the scroll API returns a scroll ID, which you need to pass to the next request to get the next "page" of results. Since you are not doing that, you keep getting the same results. You should read more about how the scroll API works in the Elasticsearch documentation.
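
The loop described above could look roughly like the sketch below. It assumes a client object shaped like the official Python client (search, scroll and clear_scroll methods, responses carrying "_scroll_id" and "hits"); the FakeClient class is a hypothetical in-memory stand-in so the example runs without a cluster.

```python
def scroll_all(client, index, query, batch_size=100, keep_alive="1m"):
    """Yield every hit for `query`, one scroll batch at a time."""
    resp = client.search(index=index, body={"query": query},
                         scroll=keep_alive, size=batch_size)
    scroll_id = resp["_scroll_id"]
    try:
        while resp["hits"]["hits"]:
            for hit in resp["hits"]["hits"]:
                yield hit
            # Pass the latest scroll id back; repeating the initial
            # search instead would return the first batch again.
            resp = client.scroll(scroll_id=scroll_id, scroll=keep_alive)
            scroll_id = resp["_scroll_id"]
    finally:
        # Release the search context on the server when done.
        client.clear_scroll(scroll_id=scroll_id)


class FakeClient:
    """Hypothetical stand-in that mimics the response shape of a scroll."""

    def __init__(self, docs):
        self._docs = docs
        self._pos = 0
        self._size = 0
        self.cleared = False

    def search(self, index, body, scroll, size):
        self._pos, self._size = 0, size
        return self._next_batch()

    def scroll(self, scroll_id, scroll):
        return self._next_batch()

    def clear_scroll(self, scroll_id):
        self.cleared = True

    def _next_batch(self):
        batch = self._docs[self._pos:self._pos + self._size]
        self._pos += len(batch)
        return {"_scroll_id": "scroll-%d" % self._pos,
                "hits": {"hits": [{"_source": doc} for doc in batch]}}


client = FakeClient([{"n": i} for i in range(25)])
hits = list(scroll_all(client, "index", {"match_all": {}}, batch_size=10))
print(len(hits))  # 25
```

With a real cluster you would pass a client instance in place of FakeClient; the exact keyword arguments may differ between client versions, so treat the call signatures as an assumption.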

You also wrote: "But then I also want to jump directly to some page, e.g. I am on page 1 and want to move to page 5."

Since there is no pagination in the scroll API, you can't simply jump between non-consecutive pages; you can only walk through the results one batch after another.

Also keep in mind that a scroll works on a point-in-time snapshot of the index, so any changes you make to the index while the scroll context is open won't be reflected in the results.


I wouldn't suggest increasing max_result_window. The limit is there for a reason, and I think we should avoid tampering with it.

Let's take an example where you run a wildcard query that returns more than 20 million matches (which I have seen in my data; our index has more than 1 billion records and a primary store size greater than 5 TB) and the user asks for the last page, i.e. around the 20 millionth record. Increasing the result window will avoid the exception, but Elasticsearch will then try to load all 20 million records into the heap, which will cause an out-of-memory error and crash your whole server, which I guess would be very bad.
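
A back-of-the-envelope sketch of that cost, assuming each shard has to build its own sorted top (from + size) list and the coordinating node then merges the lists from all shards. The shard count of 5 is an assumption for illustration, not a figure from the cluster described above.

```python
def deep_page_cost(from_, size, shards):
    """Rough count of sorted entries involved in one from/size request."""
    per_shard = from_ + size     # each shard builds its own top (from + size)
    merged = per_shard * shards  # upper bound on entries the coordinating node merges
    return per_shard, merged


# Last page of 10 hits out of 20 million matches, on 5 shards (assumed).
per_shard, merged = deep_page_cost(from_=19_999_990, size=10, shards=5)
print(per_shard)  # 20000000
print(merged)     # 100000000
```

All of that work is done just to return the final 10 hits.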

I would suggest using Search After (https://www.elastic.co/guide/en/elasticsearch/reference/5.1/search-request-search-after.html) if scroll is not an option. But Search After has its own limitations, which should be taken into consideration.
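
As a rough sketch of how Search After is driven: each request repeats the same query and sort, and every page after the first passes the sort values of the last hit from the previous page. The helper name and the "timestamp" and "id" fields are hypothetical; the sort needs a unique tiebreaker field so that no hits are skipped or repeated.

```python
def search_after_body(query, sort, size, last_hit=None):
    """Build the request body for the first or a follow-up page.

    `last_hit` is the final hit of the previous page; its "sort" values
    tell Elasticsearch where the next page starts.
    """
    body = {"query": query, "sort": sort, "size": size}
    if last_hit is not None:
        body["search_after"] = last_hit["sort"]
    return body


# Hypothetical fields: a timestamp plus a unique id as tiebreaker.
sort = [{"timestamp": "asc"}, {"id": "asc"}]

first = search_after_body({"match_all": {}}, sort, size=10)

# Pretend this was the last hit returned for the first page.
last_hit = {"_id": "42", "sort": [1480000000000, "42"]}
second = search_after_body({"match_all": {}}, sort, size=10, last_hit=last_hit)
print(second["search_after"])  # [1480000000000, '42']
```

Because each page depends on the last hit of the previous one, this only moves forward through the results; it does not let you jump straight to an arbitrary page either, which is one of the limitations to weigh.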