indexing twitter data into elasticsearch: Limit of total fields [1000] in index has been exceeded


This limit was introduced in the following GitHub issue.

The command grep type | wc -l counts the number of lines containing the text "type". Therefore I guess there is a chance the count is inaccurate. I did a small test and got a higher value than the actual number of fields. You could also end up with less than the actual number of fields, although I can't think of a scenario for that yet.

Here's the test I did.

curl -s -XGET http://localhost:9200/stackoverflow/_mapping?pretty
{
  "stackoverflow" : {
    "mappings" : {
      "os" : {
        "properties" : {
          "NAME" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "TITLE" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            },
            "fielddata" : true
          },
          "title" : {
            "type" : "text",
            "fielddata" : true
          }
        }
      }
    }
  }
}

Since the "type" is there in 5 lines I get the output as 5 even though I only have 3 fields.

Can you try increasing the limit and see if it works?

PUT my_index/_settings
{
  "index.mapping.total_fields.limit": 2000
}

You can also increase this limit during index creation.

PUT my_index
{
  "settings": {
    "index.mapping.total_fields.limit": 2000,
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    ...
  }
}

Credits: https://discuss.elastic.co/t/total-fields-limit-setting/53004/2


You can change this setting on your Elasticsearch domain by running the following command in Kibana or in Postman. Just replace the Elasticsearch URL and the index name, and it should run fine.

PUT /my_index/_settings HTTP/1.1
Host: search-test-prhtf12546bw2qdr6lfr2vq.us-east-1.es.amazonaws.com
Content-Type: application/json

{
    "index": {
        "mapping": {
            "total_fields": {
                "limit": "100000"
            }
        }
    }
}

It will give you the following response:

{    "acknowledged": true}


Defining too many fields in an index can lead to a mapping explosion, which can cause out-of-memory errors and situations that are difficult to recover from. As an example, consider a situation in which every new document inserted introduces new fields. This is quite common with dynamic mappings: every time a document contains new fields, those end up in the index's mappings. This isn't worrying for a small amount of data, but it can become a problem as the mapping grows.
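As a minimal sketch of that effect (the index name tweets and the field names are made up), indexing documents whose keys differ causes the dynamic mapping to grow with every new key:

POST tweets/_doc
{ "hashtag_politics": 1 }

POST tweets/_doc
{ "hashtag_sports": 1 }

# the mapping now contains a field for every distinct key seen so far
GET tweets/_mapping

With Twitter data, where documents can carry many varying keys, this is how the count can creep toward the 1000-field default.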

If you have nested fields that can grow and are not under the application's control, then try mapping the field as flattened. This data type can be useful for indexing objects with a large or unknown number of unique keys. Only one field mapping is created for the whole JSON object, which can help prevent a mapping explosion caused by too many distinct field mappings.
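For example, a minimal mapping sketch (my_index and user_labels are placeholder names) where the whole JSON object is indexed under a single flattened field mapping:

PUT my_index
{
  "mappings": {
    "properties": {
      "user_labels": {
        "type": "flattened"
      }
    }
  }
}

No matter how many distinct keys later appear inside user_labels, they count as a single field against index.mapping.total_fields.limit.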

Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/flattened.html