How to store relational data in elasticsearch How to store relational data in elasticsearch elasticsearch elasticsearch

How to store relational data in elasticsearch


Nested Object is a perfect approach for it. There will be no repetition of child objects in parent document if you update the child objects correctly.I'm using the same approach for one of my use case where I need to maintain relational data of Master-Child One-to-Many relationship. I've written a Painless script for Update API to Add & Update existing nested child objects inside parent document without creating duplicates or repetitive entries.

Updated Answer:

Below is the structure of Parent-Child Nested Type document with embedded nested type documents "childs".

{    "parent_id": 1,    "parent_name": "ABC",    "parent_number": 123,    "parent_addr": "123 6th St. Melbourne, FL 32904"    "childs": [      {        "child_id": 1,        "child_name": "PQR",        "child_number": 456,        "child_age": 10      },      {        "child_id": 2,        "child_name": "XYZ",        "child_number": 789,        "child_age": 12      },      {        "child_id": 3,        "child_name": "QWE",        "child_number": 234,        "child_age": 16      }    ]   }

Mapping would be as below:

PUT parent/{  "parent": {    "mappings": {      "parent": {        "properties": {          "parent_id": {            "type": "long"          },          "parent_name": {            "type": "text",            "fields": {              "keyword": {                "type": "keyword",                "ignore_above": 256              }            }          },          "parent_number": {            "type": "long"          },          "parent_addr": {            "type": "text",            "fields": {              "keyword": {                "type": "keyword",                "ignore_above": 256              }            }          },          "child_tickets": {            "type": "nested",            "properties": {              "child_id": {                "type": "long"              },              "child_name": {                "type": "text",                "fields": {                  "keyword": {                    "type": "keyword",                    "ignore_above": 256                  }                }              },              "child_number": {                "type": "long"              },              "child_age": {                "type": "long"              }            }          }        }      }    }  }}

In RDMS, these both entities(parent, child) are two different tables with One to Many relation between Parent -> Child.Parent's id is foreign key for Child's row. (id is must for both tables)

Now in Elasticsearch, to index the parent document we must have id to index it, in this case it is parent_id.Index Parent document Query(parent_id is the id which i was talking about and have index the document with id(_id) = 1):

POST parent/parent/1{    "parent_id": 1,    "parent_name": "ABC",    "parent_number": 123,    "parent_addr": "123 6th St. Melbourne, FL 32904"}

Now, adding child(s) to the parent. For that you will require child document which should have child id plus parent id. To add a child, parent id is must. Below is the update query to add new childs or update already present childs.

POST parent/parent/1/_update{    "script":{    "lang":"painless",    "inline":"if (!ctx._source.containsKey(\"childs\")) {                ctx._source.childs = [];                ctx._source.childs.add(params.child);            } else {                int flag=0;                for(int i=0;i<ctx._source.childs.size();i++){                    if(ctx._source.childs[i].child_id==params.child.child_id){                        ctx._source.childs[i]=params.child;                        flag++;                    }                }                if(flag==0){                    ctx._source.childs.add(params.child);                }            }",    "params":{        "child":{                "child_id": 1,                "child_name": "PQR",                "child_number": 456,                "child_age": 10            }        }    }}

Give it a shot. Cheers!

Let me know if you need anything else.


There are two more approaches: Denormalization and running multiple queries for joins.

Denormalization will eat up some more space and increase your write time, but you will just need to run one query to retrieve your data, hence, your read time will improve. Since you don't want to store data in a single index, so joining might help you out.