Count child pages in Elasticsearch
If you do not have many items:
You can retrieve the information using only one query:
GET /test/page/_search
{
  "filter": {
    "term": { "Parent": "0" }
  },
  "aggs": {
    "numberOfChildren": {
      "terms": { "field": "Parent", "size": 0 }
    }
  }
}
In the response, hits.hits will contain the children of 0.
For each node, the number of children is in aggregations.numberOfChildren.buckets, with this structure:
{ "key": [page id], "doc_count": [number of children for this page]}
Example response:
{
  ...
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "page",
        "_id": "a",
        "_score": 1,
        "_source": { "Id": "a", "Parent": "0" }
      }
    ]
  },
  "aggregations": {
    "numberOfChildren": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        { "key": "0", "doc_count": 1 },
        { "key": "a", "doc_count": 2 },
        { "key": "c", "doc_count": 1 }
      ]
    }
  }
}
Please note that:
- If a page does not have any children, it is not in the list.
- You get the number of children for all parents, not only for the direct children of 0, so this will break if you have many items (too many buckets).
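Since pages without children get no bucket, client code has to fill in the zeros itself. Here is a minimal Python sketch of that step, assuming the response has already been parsed into a dict; the helper name child_counts and the trimmed sample_response are only illustrative.

```python
def child_counts(response, page_ids):
    """Map each page id to its number of children (0 if it has no bucket)."""
    buckets = response["aggregations"]["numberOfChildren"]["buckets"]
    counts = {bucket["key"]: bucket["doc_count"] for bucket in buckets}
    # Pages with no children are absent from the buckets, so default to 0.
    return {page_id: counts.get(page_id, 0) for page_id in page_ids}


# Trimmed copy of the example response (illustrative data only).
sample_response = {
    "hits": {"hits": [{"_id": "a", "_source": {"Id": "a", "Parent": "0"}}]},
    "aggregations": {
        "numberOfChildren": {
            "buckets": [
                {"key": "0", "doc_count": 1},
                {"key": "a", "doc_count": 2},
                {"key": "c", "doc_count": 1},
            ]
        }
    },
}

# The direct children of 0 are the hits; look up a count for each of them.
direct_children = [hit["_source"]["Id"] for hit in sample_response["hits"]["hits"]]
print(child_counts(sample_response, direct_children))  # {'a': 2}
```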
If you have many items:
The easiest way is to use two queries:
GET /test/page/_search
{
  "query": {
    "filtered": {
      "filter": {
        "term": { "Parent": "0" }
      }
    }
  }
}
You will have 0's direct children in hits.hits.
GET /test/page/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "terms": {
          "Parent": [
            "a" // list 0's direct children ids
          ]
        }
      }
    }
  },
  "aggs": {
    "numberOfChildren": {
      "terms": {
        "field": "Parent",
        "size": 0,
        "order": { "_term": "asc" }
      }
    }
  }
}
You will have the number of children of 0's direct children in aggregations.numberOfChildren.buckets.
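The glue between the two queries is just collecting the ids from the first response and putting them into the terms filter of the second. A rough Python sketch of that step, assuming the first response is already a parsed dict; build_count_query is a made-up helper name, and the body simply mirrors the query syntax used in this answer.

```python
import json


def build_count_query(first_response):
    """Build the second request body from the hits of the first response."""
    # Ids of 0's direct children, taken from hits.hits of the first query.
    child_ids = [hit["_source"]["Id"] for hit in first_response["hits"]["hits"]]
    return {
        "size": 0,
        "query": {"filtered": {"filter": {"terms": {"Parent": child_ids}}}},
        "aggs": {
            "numberOfChildren": {
                "terms": {"field": "Parent", "size": 0, "order": {"_term": "asc"}}
            }
        },
    }


# Illustrative first response containing a single direct child "a".
first_response = {"hits": {"hits": [{"_source": {"Id": "a", "Parent": "0"}}]}}
second_body = build_count_query(first_response)
print(json.dumps(second_body, indent=2))
```

The resulting dict can be sent as the body of the second search request with whatever HTTP client you already use.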
You may also be able to use scripts, but I am not sure that they can work in this situation.
Parent-child relationship will not help you since parents and children cannot be of the same type.
I don't know if this is what you are searching for:
I inserted the same elements:
PUT /tmp_index/doc/1
{ "id": "a", "parent": "0" }
PUT /tmp_index/doc/2
{ "id": "b", "parent": "a" }
PUT /tmp_index/doc/3
{ "id": "c", "parent": "a" }
PUT /tmp_index/doc/4
{ "id": "d", "parent": "c" }
With a nested aggregation like this:
POST /tmp_index/_search?pretty
{
  "size": 0,
  "query": { "match_all": {} },
  "aggs": {
    "group_by_first": {
      "terms": { "field": "parent", "size": 0 },
      "aggs": {
        "group_by_second": {
          "terms": { "field": "id", "size": 0 }
        }
      }
    }
  }
}
You get this result:
{
  "key": "a",
  "doc_count": 2,
  "group_by_second": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "b", "doc_count": 1 },
      { "key": "c", "doc_count": 1 }
    ]
  }
},
{
  "key": "0",
  "doc_count": 1,
  "group_by_second": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "a", "doc_count": 1 }
    ]
  }
},
{
  "key": "c",
  "doc_count": 1,
  "group_by_second": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      { "key": "d", "doc_count": 1 }
    ]
  }
}
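To use this result on the client side, the nested buckets can be flattened into a parent to children mapping. A small Python sketch, assuming the group_by_first buckets have been parsed into a list of dicts; children_by_parent is just an illustrative name and sample_buckets is a trimmed copy of the result above.

```python
def children_by_parent(buckets):
    """Turn the nested terms buckets into {parent id: [child ids]}."""
    return {
        parent["key"]: [child["key"] for child in parent["group_by_second"]["buckets"]]
        for parent in buckets
    }


# Trimmed copy of the group_by_first buckets (illustrative data only).
sample_buckets = [
    {"key": "a", "doc_count": 2, "group_by_second": {"buckets": [
        {"key": "b", "doc_count": 1}, {"key": "c", "doc_count": 1}]}},
    {"key": "0", "doc_count": 1, "group_by_second": {"buckets": [
        {"key": "a", "doc_count": 1}]}},
    {"key": "c", "doc_count": 1, "group_by_second": {"buckets": [
        {"key": "d", "doc_count": 1}]}},
]

tree = children_by_parent(sample_buckets)
print(tree)  # {'a': ['b', 'c'], '0': ['a'], 'c': ['d']}

# Number of children for each direct child of "0" (0 if it never appears as a parent).
print({child: len(tree.get(child, [])) for child in tree["0"]})  # {'a': 2}
```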
[Updated]
I'm not sure if you can achieve it all in one query.
I needed something similar and ended up using msearch for the follow-up query.
POST /test/_msearch
{}
{"query" : {"term" : {"Parent": "c"}}, "size" : 0}
{}
{"query" : {"term" : {"Parent": "d"}}, "size" : 0}
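The msearch body is newline-delimited JSON: one header line and one query line per search, ending with a newline. A possible Python sketch for generating it from a list of parent ids; build_msearch_body is a hypothetical helper, and sending the body to /test/_msearch is left to your HTTP client.

```python
import json


def build_msearch_body(parent_ids):
    """Build a newline-delimited msearch body with one count query per parent."""
    lines = []
    for parent_id in parent_ids:
        lines.append(json.dumps({}))  # empty header: use the index from the URL
        lines.append(json.dumps({"query": {"term": {"Parent": parent_id}}, "size": 0}))
    # The body must end with a newline.
    return "\n".join(lines) + "\n"


body = build_msearch_body(["c", "d"])
print(body, end="")
```

Each entry of the responses array in the msearch reply corresponds to one parent, in the same order as the queries, and its hits.total is that parent's number of children.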