Apache solr search part of the word


Note: The following solution is Solr 1.4 (and above) specific!

For more flexibility, I would recommend indexing your data with the NGramTokenizerFactory, which lets you match substrings anywhere in the text (the equivalent of both leading and trailing wildcards). If you only need to match substrings at the beginning or end of the string, consider using the EdgeNGramTokenizerFactory instead.
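To make the difference concrete, here is a minimal Python sketch of what the two kinds of grams look like for a single word. It is only an illustration, not Solr's implementation: the function names `ngrams` and `edge_ngrams` are made up for this example, and the order in which Solr actually emits grams may differ between versions.

```python
def ngrams(text, min_size, max_size):
    """All substrings of length min_size..max_size (n-gram style)."""
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


def edge_ngrams(text, min_size, max_size):
    """Only the prefixes of length min_size..max_size (front edge n-grams)."""
    longest = min(max_size, len(text))
    return [text[:size] for size in range(min_size, longest + 1)]


# Every substring is indexed, so "ick" or "ken" can be matched.
print(ngrams("chicken", 3, 4))
# Only prefixes are indexed, so only "chi", "chic", ... can be matched.
print(edge_ngrams("chicken", 3, 15))
```

The n-gram list grows much faster than the edge n-gram list as words get longer, which is the trade-off between flexibility and index size.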

Here's a drop-in replacement for the text field type that would accommodate your need:

<fieldType name="text" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
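The field type uses different analyzers at index time and at query time: grams are stored in the index, while the query term is only split on whitespace and lower-cased. A rough Python sketch of that interaction follows. The helper names are hypothetical, and requiring every query term to match is an assumption (it depends on your default query operator); real Solr scoring and matching are more involved.

```python
def index_analyzer(text, min_size=3, max_size=15):
    """Mimic the index chain: n-grams of the whole value, lower-cased."""
    text = text.lower()
    return {
        text[start:start + size]
        for size in range(min_size, max_size + 1)
        for start in range(len(text) - size + 1)
    }


def query_analyzer(query):
    """Mimic the query chain: split on whitespace, lower-case each term."""
    return [term.lower() for term in query.split()]


def matches(document, query):
    """A document matches if every query term equals one of its indexed grams.

    Assumption: all terms are required (AND semantics).
    """
    grams = index_analyzer(document)
    return all(term in grams for term in query_analyzer(query))


print(matches("Chicken soup", "ick"))   # substring inside a word
print(matches("Chicken soup", "SOUP"))  # case is normalised on both sides
print(matches("Chicken soup", "ck"))    # shorter than minGramSize
```

Note that a query term shorter than minGramSize (3 here) can never match, because no gram that short is stored.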


If you want to find all words that start with chick, search for chick*.


When I used

<tokenizer class="solr.NGramTokenizerFactory" minGramSize="3" maxGramSize="15" />

for wildcard search, as suggested in Brian's answer, Solr indexing time increased dramatically, by more than 20 times! I found another solution to the wildcard search problem here:

http://www.lucidimagination.com/blog/2009/09/08/auto-suggest-from-popular-queries-using-edgengrams/

You just need to add this filter

<filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25" />

(with the default tokenizer, solr.WhitespaceTokenizerFactory, in the index analyzer block of the fieldType). For me, the results were the same at a lower system cost.
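For a rough feel of why the edge n-gram filter is cheaper, this Python sketch counts how many grams each approach would produce for one short sentence. It is a back-of-the-envelope model, not a measurement of Solr: the function names are invented for the example, and it assumes the n-gram tokenizer grams the whole field value (spaces included) while the edge n-gram filter runs on each whitespace-separated word.

```python
def count_ngrams(text, min_size, max_size):
    """Number of substrings of each length from min_size to max_size."""
    return sum(
        max(len(text) - size + 1, 0)
        for size in range(min_size, max_size + 1)
    )


def count_edge_ngrams_per_word(text, min_size, max_size):
    """Split on whitespace; each word yields one prefix per allowed length."""
    total = 0
    for word in text.split():
        total += max(min(len(word), max_size) - min_size + 1, 0)
    return total


sample = "the quick brown fox jumps over the lazy dog"

# Same gram sizes as in the two configurations discussed above.
print("n-gram tokenizer (3..15):", count_ngrams(sample, 3, 15))
print("edge n-gram filter (1..25):", count_edge_ngrams_per_word(sample, 1, 25))
```

The gap widens with longer field values, which is consistent with the much longer indexing time I saw. The price is that edge n-grams only match from the start of each word, not in the middle.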