Elasticsearch distinct filter values

This is a job for a terms aggregation (see the Elasticsearch documentation).

You can get the distinct department values like this:

POST company/employee/_search
{
  "size": 0,
  "aggs": {
    "by_departments": {
      "terms": {
        "field": "departments.name",
        "size": 0 // see note 1
      }
    }
  }
}

Which, in your example, outputs:

{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "management", // see note 2
          "doc_count": 2
        },
        {
          "key": "accounts",
          "doc_count": 1
        },
        {
          "key": "it",
          "doc_count": 1
        }
      ]
    }
  }
}
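If you only need the distinct values as a list on the client side, the bucket keys can be pulled out of the parsed response. This is a minimal Python sketch, assuming the response has already been decoded into a dict; the `response` literal copies the output above, and `distinct_keys` is a hypothetical helper name, not part of any Elasticsearch client.

```python
# Parsed search response, trimmed to the aggregation part shown above.
response = {
    "aggregations": {
        "by_departments": {
            "buckets": [
                {"key": "management", "doc_count": 2},
                {"key": "accounts", "doc_count": 1},
                {"key": "it", "doc_count": 1},
            ]
        }
    }
}


def distinct_keys(resp, agg_name):
    """Return the bucket keys of a terms aggregation, in response order.

    Hypothetical helper for illustration only.
    """
    buckets = resp["aggregations"][agg_name]["buckets"]
    return [bucket["key"] for bucket in buckets]


departments = distinct_keys(response, "by_departments")
print(departments)
```

Each bucket key is one distinct term, and `doc_count` is the number of documents containing that term.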

Two additional notes:

1. Setting size to 0 sets the maximum number of buckets to Integer.MAX_VALUE. Don't use it if there are too many distinct department values.
2. You can see that the keys are the terms produced by analyzing the department values. Be sure to run your terms aggregation on a field mapped as not_analyzed.
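If the number of distinct departments could be large, an explicit cap can be used instead of size 0. The following is a small Python sketch that builds the same request body with a bounded bucket count; `terms_request` and the value 100 are assumptions chosen for illustration, not something from the question.

```python
import json


def terms_request(field, max_buckets):
    """Build a terms aggregation request body with an explicit bucket cap.

    Hypothetical helper: the aggregation name matches the one used above.
    """
    if max_buckets <= 0:
        raise ValueError("max_buckets must be a positive integer")
    return {
        "size": 0,  # return no hits, only the aggregation
        "aggs": {
            "by_departments": {
                "terms": {
                    "field": field,
                    "size": max_buckets,  # upper bound on returned buckets
                }
            }
        },
    }


body = terms_request("departments.name", 100)
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of the same POST company/employee/_search request.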

For example, with our default mapping (departments.name is an analyzed string), adding this employee:

{
  "name": "Bill Gates",
  "departments": [
    {
      "name": "IT"
    },
    {
      "name": "Human Resource"
    }
  ]
}

will cause this kind of result:

{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "it",
          "doc_count": 2
        },
        {
          "key": "management",
          "doc_count": 2
        },
        {
          "key": "accounts",
          "doc_count": 1
        },
        {
          "key": "human",
          "doc_count": 1
        },
        {
          "key": "resource",
          "doc_count": 1
        }
      ]
    }
  }
}

With a correct mapping:

POST company
{
  "mappings": {
    "employee": {
      "properties": {
        "name": {
          "type": "string"
        },
        "departments": {
          "type": "object",
          "properties": {
            "name": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }
  }
}

The same request outputs:

{
  ...
  "aggregations": {
    "by_departments": {
      "buckets": [
        {
          "key": "IT",
          "doc_count": 2
        },
        {
          "key": "Management",
          "doc_count": 2
        },
        {
          "key": "Accounts",
          "doc_count": 1
        },
        {
          "key": "Human Resource",
          "doc_count": 1
        }
      ]
    }
  }
}
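To see why the two mappings give different keys, here is a rough Python simulation of what happens to the Bill Gates document in each case. It is only a sketch: `analyzed_terms` is a simplified stand-in for the default analyzer (lowercase, split on non-alphanumeric characters), not the real Elasticsearch implementation, and all function names are hypothetical.

```python
import re
from collections import Counter

# The employee document from the example above.
employee = {
    "name": "Bill Gates",
    "departments": [{"name": "IT"}, {"name": "Human Resource"}],
}


def analyzed_terms(value):
    """Simplified stand-in for an analyzed string: lowercase and tokenize."""
    return [t for t in re.split(r"[^0-9a-z]+", value.lower()) if t]


def not_analyzed_terms(value):
    """A not_analyzed field indexes the whole value as a single term."""
    return [value]


def doc_counts(docs, to_terms):
    """Count, per term, how many documents contain it (like doc_count)."""
    counts = Counter()
    for doc in docs:
        terms = set()
        for dept in doc["departments"]:
            terms.update(to_terms(dept["name"]))
        counts.update(terms)  # each term counts once per document
    return dict(counts)


analyzed = doc_counts([employee], analyzed_terms)
exact = doc_counts([employee], not_analyzed_terms)
print(analyzed)
print(exact)
```

With the analyzed variant, "Human Resource" is split into the two terms "human" and "resource"; with the not_analyzed variant it stays a single key, which is what you want for distinct values.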

Hope this helps!
