How to improve heavy should queries on a huge data set?

elasticsearch runtime query-builder

Given that you are retrieving the IDs of the documents, I assume that you are not executing a regular query but rather a scan, retrieving all the documents that match your query.

Now, the first query is an intersection, whereas the second is a union. Given that these words appear in 5874, 270419 and 397829 docs, the intersection contains at most 5874 documents, whereas the union contains at least 397829 (and at most 674122, if no document matches more than one word). These are the numbers of documents that your ES cluster will be returning in the two cases.
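The bounds on the two result sets follow directly from the per-term document counts. A minimal sketch of that arithmetic (the function name is illustrative, not part of any Elasticsearch API):

```python
def result_size_bounds(doc_counts):
    """Bounds on result sizes when combining terms.

    doc_counts: number of documents matching each individual term.
    Returns (intersection_max, union_min, union_max).
    """
    intersection_max = min(doc_counts)  # AND cannot exceed the rarest term
    union_min = max(doc_counts)         # OR contains at least the most common term
    union_max = sum(doc_counts)         # OR reaches this only if no document overlaps
    return intersection_max, union_min, union_max


counts = [5874, 270419, 397829]  # per-term counts quoted in the answer
print(result_size_bounds(counts))  # (5874, 397829, 674122)
```

The actual sizes depend on how much the three terms overlap, which these counts alone do not reveal.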

The drastic difference in the time taken between the two cases comes from the number of documents that have to be returned. For scanning, you must be paginating (via scroll) and repeating the request in a loop, and that takes longer as the number of documents increases.
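To get a rough feel for the gap, one can count how many scroll pages each case needs. This is only a sketch: the page size of 1000 is an assumption (the question does not state it), and the helper name is hypothetical.

```python
import math


def scroll_round_trips(total_hits, page_size):
    """Number of scroll pages needed to carry total_hits documents."""
    if total_hits <= 0:
        return 0
    return math.ceil(total_hits / page_size)


PAGE_SIZE = 1000  # assumed scroll page size, not taken from the question

print(scroll_round_trips(5874, PAGE_SIZE))    # 6   (intersection, at most)
print(scroll_round_trips(397829, PAGE_SIZE))  # 398 (union, at least)
```

Under that assumption the union needs roughly 66 times as many round trips, before counting the extra documents serialized on each page.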

If you just execute a query with some size limit instead of scanning, it is likely to finish in nearly the same time in both cases.
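A size-limited request could look roughly like the following. The original queries are not shown, so the field name `content` and the terms are placeholders, and the helper is a hypothetical convenience that only builds the request body as a plain dict:

```python
def build_bool_query(field, terms, clause, size=10):
    """Build a size-limited bool query body.

    clause: "must" for the intersection case, "should" for the union case.
    """
    if clause not in ("must", "should"):
        raise ValueError("clause must be 'must' or 'should'")
    return {
        "size": size,      # return only the top `size` hits
        "_source": False,  # IDs only, skip the document bodies
        "query": {
            "bool": {clause: [{"term": {field: t}} for t in terms]}
        },
    }


terms = ["alpha", "beta", "gamma"]  # placeholder terms
and_query = build_bool_query("content", terms, "must")
or_query = build_bool_query("content", terms, "should")
```

Either body would then be sent as an ordinary search request rather than a scroll, so only `size` hits come back regardless of how many documents match.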

CodeHunter
