
How to implement an exact match in a filter with elasticsearch?


Update: The OP mentioned in a comment that they are using ES 2.4, so I have added a solution that works for that version.

ES 2.4 solution

Index creation with required settings and mappings

    {
      "settings": {
        "analysis": {
          "analyzer": {
            "lckeyword": {
              "filter": ["lowercase"],
              "tokenizer": "keyword"
            }
          }
        }
      },
      "mappings": {
        "so": {
          "properties": {
            "state": {
              "type": "string"
            },
            "city": {
              "type": "string"
            },
            "colony": {
              "type": "string"
            },
            "state_raw": {
              "type": "string",
              "analyzer": "lckeyword"
            }
          }
        }
      }
    }

Search query

    {
      "query": {
        "filtered": {
          "query": {
            "bool": {
              "should": [
                {
                  "match": {
                    "state": {
                      "query": "michoacán de ocampo"
                    }
                  }
                },
                {
                  "match": {
                    "colony": {
                      "query": "zamora"
                    }
                  }
                },
                {
                  "match": {
                    "city": {
                      "query": "zamora"
                    }
                  }
                }
              ]
            }
          },
          "filter": {
            "term": {
              "state_raw": "michoacán de ocampo"
            }
          }
        }
      }
    }

The important part here is the custom analyzer (`keyword` tokenizer with a `lowercase` filter): the field used in the filter is indexed as a single token, unchanged except for being lowercased, and lowercase is what you are passing in your query. The query above returns both of your documents. The Postman collection (linked below) contains the index creation, the sample documents, and the query that returns both documents.
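To make the effect of the custom analyzer concrete, here is a toy Python model of it. This is only an illustration, not how Elasticsearch is implemented: `lckeyword` and `term_filter_matches` are hypothetical helpers that mimic the keyword tokenizer (the whole input becomes one token) followed by the lowercase filter, and an unanalyzed term lookup against the result.

```python
def lckeyword(text):
    """Toy model of the custom analyzer: keyword tokenizer + lowercase filter."""
    # The keyword tokenizer emits the entire input as a single token;
    # the lowercase filter then lowercases that token.
    return [text.lower()]


def term_filter_matches(indexed_value, term):
    """A term filter is not analyzed: it must equal an indexed token exactly."""
    return term in lckeyword(indexed_value)


print(term_filter_matches("Michoacán de Ocampo", "michoacán de ocampo"))  # True
print(term_filter_matches("Michoacán de Ocampo", "Michoacán de Ocampo"))  # False
print(term_filter_matches("Michoacán de Ocampo", "michoacán"))            # False
```

Only the full lowercase value matches, which is why the filter in the query above passes `michoacán de ocampo` rather than the original casing or a single word.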

ES 7.X solution

The issue is that you define your `state` field as a `text` field and then use a term query in your filter. A term query is not analyzed, as explained in the official ES documentation:

Returns documents that contain an exact term in a provided field.

Hence it would try to find the token `Michoacán de Ocampo` in the inverted index, which isn't there: the `state` field is defined as `text`, so the value is analyzed into 3 tokens, `michoacán`, `de` and `ocampo`, and ES matches the search term against the tokens in the inverted index exactly. You can check these tokens with the analyze API, and you can use the explain API to see which tokens matched when the query has results.

Fix
---

Define the `state` field as a multi-field and also store it as it is (in keyword form) so that you can filter on it.

    {
      "mappings": {
        "properties": {
          "state": {
            "type": "text",
            "fields": {
              "raw": {
                "type": "keyword"
              }
            }
          },
          "city": {
            "type": "text"
          },
          "colony": {
            "type": "text"
          }
        }
      }
    }

Now the query below gives you both results. Notice the `.raw` suffix in the filter, which searches on the keyword sub-field.

    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "state": {
                  "query": "michoacán de ocampo"
                }
              }
            },
            {
              "match": {
                "colony": {
                  "query": "zamora"
                }
              }
            },
            {
              "match": {
                "city": {
                  "query": "zamora"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "state.raw": "Michoacán de Ocampo"
            }
          }
        }
      }
    }
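The mismatch described above can be reproduced with a small Python model. It is a rough sketch under simplifying assumptions: `standard_tokens` only approximates the standard analyzer (lowercase, split on non-word characters), and the inverted index is modeled as a plain set of tokens per field. None of these function names come from Elasticsearch.

```python
import re
import unicodedata


def standard_tokens(text):
    """Rough stand-in for the standard analyzer: lowercase, split into words."""
    # NFC normalization keeps accented letters such as "á" as single characters.
    normalized = unicodedata.normalize("NFC", text)
    return re.findall(r"\w+", normalized.lower())


def index_document(state):
    """Model the multi-field: analyzed tokens for `state`, raw value for `state.raw`."""
    return {
        "state": set(standard_tokens(state)),
        "state.raw": {state},  # a keyword field keeps the value unchanged
    }


def term_query(index, field, term):
    """A term query looks up the given term as is, without analyzing it."""
    return term in index[field]


doc = index_document("Michoacán de Ocampo")
print(sorted(doc["state"]))                                  # ['de', 'michoacán', 'ocampo']
print(term_query(doc, "state", "Michoacán de Ocampo"))      # False
print(term_query(doc, "state.raw", "Michoacán de Ocampo"))  # True
```

In this model the term lookup on `state` fails because the full phrase is not one of the three tokens, while the same lookup on `state.raw` succeeds because the value is stored untouched.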

EDIT: - https://www.getpostman.com/collections/f4b9ed00d50e2f4bc7f4 is the postman collection link if you want to quickly test it.


My guess is that the mapping of your `state` field is the default one, i.e., `state` is a text field with a keyword sub-field (see dynamic field mapping).

If this is the case, then the filter of your first query "works" because it matches one of the tokens created by the default analyzer for text fields. In fact, "Michoacán de Ocampo" is processed into these three lowercase tokens: ["michoacán", "de", "ocampo"].

For the same reason, the second filter cannot match: a term query is not analyzed, so it looks up the whole phrase "Michoacán de Ocampo", with its original case, as a single token, and the text field contains no such token. What should work is the following query, which filters on the keyword sub-field instead:

    {
      "query": {
        "bool": {
          "must": [
            {
              "match": {
                "state": {
                  "query": "michoacán de ocampo"
                }
              }
            },
            {
              "match": {
                "colony": {
                  "query": "zamora"
                }
              }
            },
            {
              "match": {
                "city": {
                  "query": "zamora"
                }
              }
            }
          ],
          "filter": {
            "term": {
              "state.keyword": "Michoacán de Ocampo"
            }
          }
        }
      }
    }
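If you send this query from code, the request body can be assembled as a plain dictionary. Below is a minimal Python sketch; `build_query` and its parameter names are hypothetical, and the default sub-field name `state.keyword` assumes the default dynamic mapping guessed at above (pass a different name if you mapped the sub-field yourself).

```python
import json


def build_query(state, place, keyword_field="state.keyword"):
    """Build a bool query: analyzed matches in `must`, exact match in `filter`."""
    return {
        "query": {
            "bool": {
                # match queries are analyzed, so the casing of `state` does not matter here
                "must": [
                    {"match": {"state": {"query": state}}},
                    {"match": {"colony": {"query": place}}},
                    {"match": {"city": {"query": place}}},
                ],
                # the term filter is not analyzed: `state` must equal the stored value exactly
                "filter": {"term": {keyword_field: state}},
            }
        }
    }


body = build_query("Michoacán de Ocampo", "zamora")
print(json.dumps(body, indent=2, ensure_ascii=False))
```

The value passed as `state` has to use the exact casing stored in the documents, since it is reused unchanged in the term filter.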