Difference(s) between Solr's Cursor and ElasticSearch's Scroll


Solr's cursor (cursorMark) and start both function like open-ended range queries. The cursor operates like a range query that selects everything sorting after the last document of the previous page (for a descending sort on score, a less-than range query on score), while start operates like a greater-than range query on rank. The cursor is faster, especially for deep pagination: for a page size of 10, it only needs to hold in memory and sort at most the top 10 results, whereas start=N must hold in memory and sort the top N + 10 results, where N increases by 10 for each subsequent page. Both are sensitive to index modifications during pagination because each query runs against the current state of the index.
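To make the memory difference concrete, here is a minimal sketch in plain Python. It is a toy model, not Solr code: `DOCS`, `sort_key`, `page_by_start`, and `page_by_cursor` are hypothetical names, and the "index" is just an in-memory list of (score, id) pairs sorted by score descending with the id as a tiebreaker.

```python
import heapq

# Hypothetical in-memory "index": (score, doc_id) pairs. Scores repeat in
# pairs, so the id is needed as a tiebreaker to get a total order.
DOCS = [(100 - i // 2, f"doc{i:03d}") for i in range(100)]


def sort_key(doc):
    """Order by score descending, then id ascending."""
    score, doc_id = doc
    return (-score, doc_id)


def page_by_start(docs, start, rows):
    """Offset paging: must collect the top start + rows docs, then drop `start` of them."""
    top = heapq.nsmallest(start + rows, docs, key=sort_key)
    return top[start:]


def page_by_cursor(docs, cursor, rows):
    """Cursor paging: filter to docs sorting strictly after the cursor,
    then collect only the top `rows` of those."""
    candidates = (d for d in docs if cursor is None or sort_key(d) > cursor)
    page = heapq.nsmallest(rows, candidates, key=sort_key)
    next_cursor = sort_key(page[-1]) if page else cursor
    return page, next_cursor
```

Fetching the third page with `page_by_start(DOCS, 20, 10)` keeps 30 candidates in the heap, while three successive calls to `page_by_cursor(DOCS, cursor, 10)` never keep more than 10. Both return the same documents as long as the list is not modified in between.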

Elasticsearch's scroll functions like a single-use, forward-only linear scan through a snapshot of the results of a fixed query, and it is guaranteed to return each document exactly once. It is not affected by index modifications because Elasticsearch remembers all the documents associated with the index at the time the "scroll context" was created, by preserving the immutable segment files that contain them for as long as the scroll context is alive. To avoid accumulating a stockpile of old segment files referred to by scroll contexts that will never be used again (perhaps because the client crashed), scroll contexts expire after a specified duration. My guess is that scroll supports neither jumping to arbitrary pages nor altering the query mid-scroll so that it can be optimized for efficient sequential scanning.
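The snapshot and expiry semantics described above can be modelled in a few lines of Python. This is only a toy illustration of the observable behavior, not how Elasticsearch implements it (Elasticsearch retains segment files rather than copying results). The class name `ScrollContext`, its parameters, and the choice to renew the keep-alive on every batch are assumptions of this sketch.

```python
import time


class ScrollContext:
    """Toy model of a scroll: a frozen, forward-only view over query results."""

    def __init__(self, index, query, keep_alive, clock=time.monotonic):
        self._clock = clock
        self._keep_alive = keep_alive
        # Freeze the matching documents now; later changes to `index`
        # (additions, removals) are invisible to this context.
        # Documents themselves are assumed to be treated as immutable.
        self._snapshot = tuple(doc for doc in index if query(doc))
        self._position = 0
        self._expires_at = clock() + keep_alive

    def next_batch(self, size):
        """Return the next `size` documents; an empty list means the scan is done."""
        if self._clock() > self._expires_at:
            raise RuntimeError("scroll context expired")
        batch = self._snapshot[self._position:self._position + size]
        self._position += len(batch)  # forward-only: no way to seek back
        self._expires_at = self._clock() + self._keep_alive  # renew on use
        return list(batch)
```

A client that stops calling `next_batch` simply lets the context lapse, which is the toy equivalent of Elasticsearch being free to release the old segment files.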

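You can partially emulate the behavior of Solr's cursor in Elasticsearch by sorting on a field and adding an open-ended range query on that field, in which the lower bound (for an ascending sort) or upper bound (for a descending sort) is set to the last value of the previous batch of results. This only pages cleanly when the sort values are unique; otherwise documents tied with the boundary value are skipped (exclusive bound) or repeated (inclusive bound).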
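As a rough sketch of that emulation, the helper below builds the search request body for the next page as a plain Python dict using the standard `bool`, `range`, and `sort` query DSL. The function name `next_page_request` and the field name in the usage note are hypothetical, and the sketch assumes the sort field holds unique values.

```python
def next_page_request(base_query, sort_field, last_value, size=10, ascending=True):
    """Build a search body that returns the page following `last_value`.

    `last_value` is the sort-field value of the final hit in the previous
    batch, or None for the first page.
    """
    bool_query = {"must": [base_query]}
    if last_value is not None:
        # Exclusive bound: "gt" when sorting ascending, "lt" when descending.
        bound = "gt" if ascending else "lt"
        bool_query["filter"] = [{"range": {sort_field: {bound: last_value}}}]
    return {
        "size": size,
        "sort": [{sort_field: {"order": "asc" if ascending else "desc"}}],
        "query": {"bool": bool_query},
    }
```

For example, `next_page_request({"match_all": {}}, "created_at", None)` describes the first page; passing the `created_at` value of the last hit as `last_value` describes the one after it. Unlike scroll, each such request runs against the current state of the index.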