
elasticsearch analyzer - lowercase and whitespace tokenizer


I managed to write a custom analyzer, and this works...

"settings":{  "analysis": {    "analyzer": {      "lowercasespaceanalyzer": {        "type": "custom",        "tokenizer": "whitespace",        "filter": [          "lowercase"        ]      }    }  }},"mappings": { "my_type" : {  "properties" : {    "title" : { "type" : "string", "analyzer" : "lowercasespaceanalyzer", "tokenizer": "whitespace", "search_analyzer":"whitespace", "filter": [      "lowercase"    ] }  } }}


You have two options:

Simple Analyser

The built-in simple analyser will probably meet your needs:

curl -XGET 'localhost:9200/myindex/_analyze?analyzer=simple&pretty' -d 'Some DATA'

{
  "tokens" : [ {
    "token" : "some",
    "start_offset" : 0,
    "end_offset" : 4,
    "type" : "word",
    "position" : 1
  }, {
    "token" : "data",
    "start_offset" : 5,
    "end_offset" : 9,
    "type" : "word",
    "position" : 2
  } ]
}

To use the simple analyser in your mapping:

{ "mappings": {   "my_type" : {      "properties" : {        "title" : { "type" : "string", "analyzer" : "simple"}      }    }  }}

Custom Analyser

The second option is to define your own custom analyser, specifying how the data should be tokenised and filtered, and then refer to that analyser in your mapping.
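A minimal sketch, reusing the lowercasespaceanalyzer name from the question: the analyser is declared once under settings.analysis, and the field mapping then only needs the analyzer key. The extra tokenizer, filter and search_analyzer entries from the question's mapping can be dropped, since tokenizer and filter belong in the analysis settings, and the same analyser is applied at search time unless a separate search_analyzer is set.

curl -XPUT 'localhost:9200/myindex' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "lowercasespaceanalyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "my_type": {
      "properties": {
        "title": { "type": "string", "analyzer": "lowercasespaceanalyzer" }
      }
    }
  }
}'

To check that the field picks up the analyser (again assuming the pre-5.x _analyze parameters):

curl -XGET 'localhost:9200/myindex/_analyze?field=title&pretty' -d 'Some DATA'
# expected tokens: "some", "data" (lowercased, split on whitespace only)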