Logstash + Kibana terms panel without breaking words


Your problem here is that your data is being tokenized, which is what makes full-text search over your data possible. By default, ES analyzes string fields such as message and splits them into separate tokens so that each one can be searched. For example, you may want to search for the word ERROR in your logs and get back messages like "There was an error in your cluster" or "Error processing whatever". If the data for that field isn't analyzed with a tokenizer, you won't be able to search like this.
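To make the search side concrete, here is a minimal sketch in Python. The analyze function is only a rough stand-in for the default analyzer (lowercase, then split on anything that is not a letter or digit), and the third message is made up for illustration; none of this is actual Elasticsearch code.

```python
import re

def analyze(text):
    """Rough approximation of the default analyzer: lowercase and split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]

def search(term, docs):
    """Return the docs whose token list contains the (analyzed) search term."""
    wanted = analyze(term)
    return [doc for doc in docs if all(tok in analyze(doc) for tok in wanted)]

messages = [
    "There was an error in your cluster",
    "Error processing whatever",
    "All good",  # hypothetical message, not from the question
]

hits = search("ERROR", messages)
print(hits)
```

Because both the messages and the search term go through the same analysis, ERROR matches "error" and "Error" alike. Without that step, only an exact match on the whole field value would be found.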

This analyzed behaviour is helpful when you want to search, but it doesn't let you group different messages that have the same content, which is your use case. The solution is to update your mapping, setting not_analyzed on the specific field that you don't want split into tokens. This will probably work for your host field, but it will probably break full-text search on that field.
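The effect on a terms panel can be sketched the same way. This is a simulation under assumptions, not what Elasticsearch runs internally: analyze only approximates the default analyzer, and the host names are invented for the example.

```python
import re
from collections import Counter

def analyze(value):
    """Rough approximation of the default analyzer: lowercase and split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", value.lower()) if tok]

# Hypothetical host values, one per log event.
hosts = ["web-server-01", "web-server-01", "db-server-02"]

# Analyzed field: counts are per token, so host names fall apart.
analyzed_counts = Counter(tok for host in hosts for tok in set(analyze(host)))

# not_analyzed field: counts are per whole value.
not_analyzed_counts = Counter(hosts)

print(analyzed_counts)
print(not_analyzed_counts)
```

In the analyzed case the panel would show entries such as "server" and "web" instead of the host names; in the not_analyzed case each full host name is one entry.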

What I usually do in this kind of situation is use index templates and multifields. An index template lets me set a mapping for every index whose name matches a pattern, and multifields let me have both the analyzed and the not_analyzed behaviour on the same field.

The following request would do the job for your problem:

curl -XPUT https://example.org/_template/name_of_index_template -d '{
    "template": "indexname*",
    "mappings": {
        "type": {
            "properties": {
                "field_name": {
                    "type": "multi_field",
                    "fields": {
                        "field_name": {
                            "type": "string",
                            "index": "analyzed"
                        },
                        "untouched": {
                            "type": "string",
                            "index": "not_analyzed"
                        }
                    }
                }
            }
        }
    }
}'
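Since a deeply nested body like this is easy to get wrong by hand (one missing brace and the request is rejected), it can help to generate it. A minimal sketch in Python, where build_template is a hypothetical helper name and the arguments are the same placeholders used in the request above:

```python
import json

def build_template(index_pattern, doc_type, field_name):
    """Build an index template body with an analyzed field plus a not_analyzed subfield."""
    return {
        "template": index_pattern,
        "mappings": {
            doc_type: {
                "properties": {
                    field_name: {
                        "type": "multi_field",
                        "fields": {
                            # Default subfield: tokenized, used for searching.
                            field_name: {"type": "string", "index": "analyzed"},
                            # Extra subfield: whole value, used for the terms panel.
                            "untouched": {"type": "string", "index": "not_analyzed"},
                        },
                    }
                }
            }
        },
    }

# Placeholder values taken from the request above.
body = json.dumps(build_template("indexname*", "type", "field_name"), indent=4)
print(body)
```

The printed JSON is guaranteed to be well formed and can be passed to curl with -d, for example from a file.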

Then in your terms panel you can use field_name.untouched, so that the entire content of the field is considered when the count of the different values is calculated.

If you don't want to use index templates (maybe your data is in a single index), setting the mapping with the Put Mapping API would do the job too. And if you use multifields, there is no need to reindex for new data to pick up the change: from the moment you set the new mapping for the index, new data will be indexed into both subfields (field_name and field_name.untouched). Documents that were indexed before the change won't have the untouched subfield until they are indexed again. If you just change the mapping from analyzed to not_analyzed, you won't see any change until you reindex all your data.


Since you didn't define a mapping in Elasticsearch, the default settings apply to every field of your type in your index. The default for string fields (like your server field) is to analyze the field, meaning that Elasticsearch tokenizes the field contents. That is why it's splitting your server names into parts.

You can overcome this issue by defining a mapping. You don't have to define all your fields, only the ones that you don't want Elasticsearch to analyze. In your particular case, sending the following PUT request will do the trick:

PUT http://[host]:9200/[index_name]/_mapping/[type]

{
    "type" : {
        "properties" : {
            "server" : {"type" : "string", "index" : "not_analyzed"}
        }
    }
}
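If more than one field needs this treatment, the same body can be generated for a list of field names. A small sketch in Python; not_analyzed_mapping is a hypothetical helper, and "host" is only an example of a second field you might add:

```python
import json

def not_analyzed_mapping(doc_type, field_names):
    """Build a mapping body that marks only the given string fields as not_analyzed."""
    return {
        doc_type: {
            "properties": {
                name: {"type": "string", "index": "not_analyzed"}
                for name in field_names
            }
        }
    }

# Fields left out of the list keep the default (analyzed) behaviour.
mapping = not_analyzed_mapping("type", ["server", "host"])
print(json.dumps(mapping, indent=4))
```

Only the listed fields appear in the body; everything else in the type is left to the defaults, as described above.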

You can't do this for a field in an already existing index, because switching from analyzed to not_analyzed is a major change in the mapping. The mapping has to be in place before the data is indexed.