how to configure the synonyms_path in elasticsearch


I don't know whether your problem comes from a bad synonym definition for "bar". Since you said you are fairly new to this, here is a working example similar to yours. It shows how Elasticsearch deals with synonyms at search time and at index time. Hope it helps.

First, create the synonym file:

foo => foo bar, baz
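How this rule is read matters for the results further down. In the Solr synonym format, commas separate alternatives and whitespace inside an alternative forms a multi-word term, so foo maps to the two-word term "foo bar" and to "baz". A minimal sketch of that reading, where parse_rule is a hypothetical helper (not part of Elasticsearch) that ignores comments and escaping:

```python
def parse_rule(line):
    """Parse one explicit mapping rule such as 'foo => foo bar, baz'.

    Hypothetical helper for illustration only; it is not an Elasticsearch API
    and it does not handle comments or escaped characters.
    """
    left, right = line.split("=>")
    # Commas separate alternatives; whitespace separates the words of one alternative.
    sources = [alt.split() for alt in left.split(",")]
    targets = [alt.split() for alt in right.split(",")]
    return sources, targets


sources, targets = parse_rule("foo => foo bar, baz")
print(sources)  # [['foo']]
print(targets)  # [['foo', 'bar'], ['baz']]
```

Note that the rule is one-directional: baz on its own is not expanded to foo.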

Now I create the index with the particular settings you are trying to test:

curl -XPUT 'http://localhost:9200/test/' -d '{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "synonym": {
            "tokenizer": "whitespace",
            "filter": ["synonym"]
          }
        },
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms_path": "synonyms.txt"
          }
        }
      }
    }
  },
  "mappings": {
    "test": {
      "properties": {
        "text_1": {
          "type": "string",
          "analyzer": "synonym"
        },
        "text_2": {
          "search_analyzer": "standard",
          "index_analyzer": "standard",
          "type": "string"
        },
        "text_3": {
          "type": "string",
          "search_analyzer": "synonym",
          "index_analyzer": "standard"
        }
      }
    }
  }
}'

Note that synonyms.txt must be in the same directory as the configuration file, since the path is relative to the config directory.
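As a rough illustration of that path rule, the sketch below joins a relative synonyms_path onto a config directory and writes the file there. The function resolve_synonyms_path is a hypothetical name, and the temporary directory only stands in for your real config directory, whose location depends on how Elasticsearch was installed:

```python
import tempfile
from pathlib import Path


def resolve_synonyms_path(config_dir, synonyms_path):
    """Resolve a relative synonyms_path against the config directory.

    Hypothetical helper that mirrors the rule described above; it is not
    Elasticsearch code.
    """
    return Path(config_dir) / synonyms_path


# Stand-in for the real config directory (assumption: replace with your own).
config_dir = tempfile.mkdtemp()

target = resolve_synonyms_path(config_dir, "synonyms.txt")
target.write_text("foo => foo bar, baz\n", encoding="utf-8")

print(target.read_text(encoding="utf-8").strip())  # foo => foo bar, baz
```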

Now index a doc:

curl -XPUT 'http://localhost:9200/test/test/1' -d '{
  "text_3": "baz dog cat",
  "text_2": "foo dog cat",
  "text_1": "foo dog cat"
}'

Now the searches

Searching in field text_1

curl -XGET 'http://localhost:9200/test/_search?q=text_1:baz'

result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test",
        "_type": "test",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "text_3": "baz dog cat",
          "text_2": "foo dog cat",
          "text_1": "foo dog cat"
        }
      }
    ]
  }
}

You get the document because baz is a synonym of foo, and at index time foo is expanded with its synonyms.

Searching in field text_2

curl -XGET 'http://localhost:9200/test/_search?q=text_2:baz'

result:

{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

I get no hits because synonyms were not expanded while indexing (this field uses the standard analyzer). Since I'm searching for baz and baz is not in the text, there is no result.

Searching in field text_3

curl -XGET 'http://localhost:9200/test/_search?q=text_3:foo'

result:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test",
        "_type": "test",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "text_3": "baz dog cat",
          "text_2": "foo dog cat",
          "text_1": "foo dog cat"
        }
      }
    ]
  }
}

Note: text_3 is "baz dog cat"

text_3 was indexed without expanding synonyms, but its search analyzer is the synonym analyzer. Since I'm searching for foo, which has "baz" as one of its synonyms, the query is expanded at search time and I get the result.

If you want to debug, you can use the _analyze endpoint, for example:
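The three search outcomes above can be reproduced with a toy model. This is only a sketch under simplifying assumptions, not how Elasticsearch is implemented: analyzers are plain functions that return a set of terms, the multi-word synonym "foo bar" is flattened into single terms, and a document matches when any query term equals an indexed term. All names here (SYNONYMS, FIELDS, matches) are made up for the illustration.

```python
# Flattened from the rule: foo => foo bar, baz (one-directional).
SYNONYMS = {"foo": ["foo", "bar", "baz"]}


def standard(text):
    """Toy stand-in for the standard analyzer: lowercase, split on whitespace."""
    return set(text.lower().split())


def synonym(text):
    """Toy stand-in for the synonym analyzer: whitespace split plus expansion."""
    terms = set()
    for token in text.split():
        terms.update(SYNONYMS.get(token, [token]))
    return terms


# (index analyzer, search analyzer) per field, as in the mapping above.
FIELDS = {
    "text_1": (synonym, synonym),
    "text_2": (standard, standard),
    "text_3": (standard, synonym),
}

DOC = {
    "text_1": "foo dog cat",
    "text_2": "foo dog cat",
    "text_3": "baz dog cat",
}


def matches(field, query):
    """True when the analyzed query shares at least one term with the indexed field."""
    index_analyzer, search_analyzer = FIELDS[field]
    return bool(index_analyzer(DOC[field]) & search_analyzer(query))


print(matches("text_1", "baz"))  # True: foo was expanded at index time
print(matches("text_2", "baz"))  # False: no expansion on either side
print(matches("text_3", "foo"))  # True: foo is expanded at search time
```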

curl -XGET 'http://localhost:9200/test/_analyze?text=foo&analyzer=synonym&pretty=true'

result:

{
  "tokens": [
    {
      "token": "foo",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "baz",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 1
    },
    {
      "token": "bar",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 2
    }
  ]
}
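The positions in this output are consistent with "foo bar" being treated as one two-word synonym: foo and baz share position 1, and bar follows at position 2. A small sketch of that position assignment, assuming the alternatives are already parsed into word lists (expand is a hypothetical function, not the actual Lucene synonym filter):

```python
def expand(alternatives, start_position=1):
    """Lay out synonym alternatives as (token, position) pairs.

    Each alternative is a list of words. The first words of all alternatives
    share the starting position, and later words of a multi-word alternative
    take the following positions. Toy model only.
    """
    tokens = []
    longest = max(len(alt) for alt in alternatives)
    for offset in range(longest):
        for alt in alternatives:
            if offset < len(alt):
                tokens.append((alt[offset], start_position + offset))
    return tokens


# Alternatives from the rule: foo => foo bar, baz
print(expand([["foo", "bar"], ["baz"]]))
# [('foo', 1), ('baz', 1), ('bar', 2)]
```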