
Elasticsearch: Scoring with Ngrams


You can solve that using an edgeNGram tokenizer instead of an edgeNGram filter:

    settings: {
      analysis: {
        tokenizer: {
          ngram_tokenizer: {
            type: 'edge_ngram',
            min_gram: 2,
            max_gram: 15
          }
        },
        analyzer: {
          ngram_analyzer: {
            type: 'custom',
            tokenizer: 'ngram_tokenizer',
            filter: [
              'lowercase'
            ]
          }
        }
      }
    }
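For completeness, here is a sketch of how that analyzer could be wired into a mapping via the REST API. The index name my_index, type doc, and field name are made up for illustration, and the index_analyzer/search_analyzer keys are the pre-2.0 mapping syntax (2.0 renamed index_analyzer to analyzer):

    # Create a hypothetical index that indexes the "name" field with
    # the ngram_analyzer but analyzes search input with "standard"
    curl -XPUT 'localhost:9200/my_index' -d '{
      "settings": {
        "analysis": {
          "tokenizer": {
            "ngram_tokenizer": {
              "type": "edge_ngram",
              "min_gram": 2,
              "max_gram": 15
            }
          },
          "analyzer": {
            "ngram_analyzer": {
              "type": "custom",
              "tokenizer": "ngram_tokenizer",
              "filter": ["lowercase"]
            }
          }
        }
      },
      "mappings": {
        "doc": {
          "properties": {
            "name": {
              "type": "string",
              "index_analyzer": "ngram_analyzer",
              "search_analyzer": "standard"
            }
          }
        }
      }
    }'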

The reason is that the edgeNGram filter writes all the grams for a given token at the same position (much like synonyms do), while the edgeNGram tokenizer creates tokens at separate positions. The number of positions feeds into the field's length normalization, and the length norm in turn changes the score.
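You can see the position difference directly with the _analyze API. A minimal sketch, assuming a pre-5.0 cluster (where _analyze takes query-string parameters) and a throwaway index named test:

    # One index with both variants: an edge_ngram *filter* on top of a
    # keyword tokenizer, and the edge_ngram *tokenizer* itself
    curl -XPUT 'localhost:9200/test' -d '{
      "settings": {
        "analysis": {
          "tokenizer": {
            "ngram_tokenizer": { "type": "edge_ngram", "min_gram": 2, "max_gram": 15 }
          },
          "filter": {
            "ngram_filter": { "type": "edge_ngram", "min_gram": 2, "max_gram": 15 }
          },
          "analyzer": {
            "filter_variant": {
              "type": "custom",
              "tokenizer": "keyword",
              "filter": ["lowercase", "ngram_filter"]
            },
            "tokenizer_variant": {
              "type": "custom",
              "tokenizer": "ngram_tokenizer",
              "filter": ["lowercase"]
            }
          }
        }
      }
    }'

    # Filter variant: all grams report the same position
    curl 'localhost:9200/test/_analyze?analyzer=filter_variant&pretty' -d 'house'

    # Tokenizer variant: each gram gets its own position, so the field
    # "looks longer" to length normalization
    curl 'localhost:9200/test/_analyze?analyzer=tokenizer_variant&pretty' -d 'house'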

Note that this works only on pre-2.0 ES releases: there, a compound score is computed from the scores of all ngram tokens, whereas in ES 2.x only the matching token is scored.