
ElasticSearch: Using output of one query as input to another


I would like to use this opportunity to advertise a different approach to the given problem. In fact, ElasticSearch: The Definitive Guide does a pretty good job on its own, so I will just quote it:

Four common techniques are used to manage relational data in Elasticsearch:

  • Application-side joins
  • Data denormalization
  • Nested objects
  • Parent/child relationships

Often the final solution will require a mixture of a few of these techniques.

In practice, data denormalization means storing the data in such a way that a single query does what previously took two consecutive queries.

Here I will unfold the example from the aforementioned book. Suppose you have the following two document types, user and blogpost, in one index, and you wish to find all blog posts written by any person named John:

    PUT /my_index/user/1
    {
      "name":  "John Smith",
      "email": "john@smith.com",
      "dob":   "1970/10/24"
    }

    PUT /my_index/blogpost/2
    {
      "title":  "Relationships",
      "body":   "It's complicated...",
      "userID": 1
    }
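Querying this layout amounts to an application-side join: one search for the users, then a second search for their posts. Here is a minimal Python sketch of that flow. It only builds the request bodies and reads IDs out of a hand-written response, so nothing is sent to a cluster. The helper names are my own, and the response is a made-up sample in the usual `hits.hits[]._id` shape.

```python
# Sketch of an application-side join. The helper names are hypothetical and
# the sample response below is hand-written; no cluster is contacted.

def users_named_query(name):
    """Step 1 body: find users whose `name` field matches."""
    return {"query": {"match": {"name": name}}}


def extract_ids(search_response):
    """Collect document IDs from a search response."""
    # _id comes back as a string, while the sample posts store userID as a
    # number, so convert here (an assumption tied to this sample data).
    return [int(hit["_id"]) for hit in search_response["hits"]["hits"]]


def posts_by_users_query(user_ids):
    """Step 2 body: find blog posts whose `userID` is one of the given IDs."""
    return {"query": {"terms": {"userID": user_ids}}}


# Hypothetical response to step 1, written by hand for illustration.
step_one_response = {
    "hits": {"hits": [{"_id": "1", "_source": {"name": "John Smith"}}]}
}

user_ids = extract_ids(step_one_response)
step_two_body = posts_by_users_query(user_ids)
print(step_two_body)
```

The output of the first query (the user IDs) becomes the input of the second one, which is exactly the two-round-trip pattern that denormalization tries to avoid.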

With this layout there is no other option but to first fetch the IDs of all users named John and then look up the blog posts by those IDs. What you could do instead is copy some of the user information onto the blogpost object:

    PUT /my_index/user/1
    {
      "name":  "John Smith",
      "email": "john@smith.com",
      "dob":   "1970/10/24"
    }

    PUT /my_index/blogpost/2
    {
      "title": "Relationships",
      "body":  "It's complicated...",
      "user":  {
        "id":   1,
        "name": "John Smith"
      }
    }

This enables searching blogpost documents directly on the user.name field.
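With the user's name stored on each post, the lookup collapses into one request. A minimal sketch of the request body, again in Python and again only building a dict; the function name is hypothetical:

```python
# Sketch: single-query lookup against the denormalized blogpost documents.
# The function name is hypothetical; only the request body is built here.

def posts_by_author_name_query(name):
    """Body for one search over blog posts, matching the embedded user name."""
    return {"query": {"match": {"user.name": name}}}


body = posts_by_author_name_query("John")
print(body)
```

This body would be sent to the blogpost search endpoint in a single round trip, with no intermediate list of user IDs.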

Apart from the traditional ElasticSearch methods, you may also consider third-party plugins such as Siren Join:

This join is used to filter one document set based on a second document set, hence its name. It is equivalent to the EXISTS() operator in SQL.