Best way to deal with misspellings in a MySQL fulltext search

php mysql lucene full-text-search sphinx

I think you should use SOUNDS LIKE or SOUNDEX().

As your data set is so small, one solution may be to create a new table that stores the individual words (or their Soundex values) contained in each text field, and to use SOUNDS LIKE on that table.

E.g.:

    SELECT * FROM table
    WHERE id IN (
        SELECT refid FROM tableofwords
        WHERE column SOUNDS LIKE 'right'
           OR column SOUNDS LIKE 'shlder'
    )
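A word table like this has to be filled from the existing text fields first. The following is only a minimal sketch of that step: the helper name `wordRows` and the row keys `refid`, `word` and `sound` are assumptions made for illustration, and the INSERT into the real table is left out.

```php
<?php
// Hypothetical helper: turn one text field into rows for the word table.
// The keys (refid, word, sound) are illustrative, not taken from a real schema.
function wordRows(int $refid, string $text): array
{
    // Extract the lowercased alphabetic words from the text.
    preg_match_all('/[a-z]+/', strtolower($text), $matches);

    $rows = [];
    // Store each distinct word once, together with its Soundex code.
    foreach (array_unique($matches[0]) as $word) {
        $rows[] = ['refid' => $refid, 'word' => $word, 'sound' => soundex($word)];
    }
    return $rows;
}

$rows = wordRows(7, 'Pain in right shoulder');
foreach ($rows as $row) {
    echo $row['refid'], "\t", $row['word'], "\t", $row['sound'], "\n";
}
```

Storing the precomputed code next to each word means a misspelling such as 'shlder' can be matched by comparing codes, since it has the same Soundex code as 'shoulder'.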

See: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html

I believe it is not possible to wildcard-search the string :(


MySQL doesn't support SOUNDEX search in fulltext.

If you want to implement a Lucene-like framework, you have to take all the documents, split them into words, and then build an index entry for each word.
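To make the idea concrete, here is a rough in-memory sketch of such an index, keyed by Soundex code. The function names `buildSoundexIndex` and `searchSoundexIndex` and the sample documents are made up for illustration; in practice the index would live in a MySQL table rather than a PHP array.

```php
<?php
// Hypothetical in-memory stand-in for the word index.
// Maps each Soundex code to the ids of the documents containing a word with that code.
function buildSoundexIndex(array $documents): array
{
    $index = [];
    foreach ($documents as $id => $text) {
        preg_match_all('/[a-z]+/', strtolower($text), $matches);
        foreach ($matches[0] as $word) {
            $index[soundex($word)][$id] = true;
        }
    }
    return $index;
}

// Return the sorted ids of documents matching any word of the search phrase by sound.
function searchSoundexIndex(array $index, string $search): array
{
    preg_match_all('/[a-z]+/', strtolower($search), $matches);
    $ids = [];
    foreach ($matches[0] as $word) {
        // Array union keeps the ids already collected and adds the new ones.
        $ids += $index[soundex($word)] ?? [];
    }
    $ids = array_keys($ids);
    sort($ids);
    return $ids;
}

// Sample data, invented for this example.
$documents = [
    1 => 'Pain in the right shoulder',
    2 => 'Left knee swelling',
    3 => 'Shoulder surgery follow-up',
];
$index = buildSoundexIndex($documents);
$hits = searchSoundexIndex($index, 'shlder');
print_r($hits);
```

Here the misspelled 'shlder' finds documents 1 and 3, because both contain 'shoulder', which has the same Soundex code.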

When someone searches for "right shlder", you have to run a SOUNDEX search for each word against the words table:

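    $search = 'right shlder';
    // Split the search phrase into words.
    preg_match_all('/\w+/', $search, $matches);
    // soundex() returns only a letter followed by digits, so the codes are safe to inline.
    $sounds = array_map('soundex', $matches[0]);
    // MySQL's SOUNDEX() can return more than four characters, while PHP's soundex()
    // always returns four, so compare on the first four only.
    $query = "SELECT word FROM words_list
        WHERE SUBSTRING(SOUNDEX(word), 1, 4) IN ('" . join("','", $sounds) . "')";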

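and then make a fulltext search with the words that matched:

    // Escape the words before inlining them in real code.
    $query2 = "SELECT * FROM table
        WHERE MATCH(fulltextcolumn)
        AGAINST ('" . join(' ', $results) . "' IN BOOLEAN MODE)";

Where $results is an array with the words returned by the first query. In boolean mode, words listed without an operator are optional, so a row matches if it contains any of them.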


The technical term for what you are looking for is Levenshtein distance: the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another.

PHP has two built-in functions for this kind of comparison, similar_text and levenshtein, which should help with your problem. You will have to benchmark whether they are fast enough for your needs.
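As a rough illustration of how levenshtein could be used here, the sketch below picks the closest word from a small vocabulary. The helper name `closestWord`, the vocabulary and the distance cut-off of 3 are assumptions chosen for the example; in a real setup the vocabulary would come from the indexed text.

```php
<?php
// Hypothetical helper: return the vocabulary word closest to a possibly misspelled
// input, or null when nothing is within $maxDistance edits.
function closestWord(string $input, array $vocabulary, int $maxDistance = 3): ?string
{
    $best = null;
    $bestDistance = $maxDistance + 1;
    foreach ($vocabulary as $word) {
        // levenshtein() counts the insertions, deletions and substitutions needed.
        $distance = levenshtein($input, $word);
        if ($distance < $bestDistance) {
            $best = $word;
            $bestDistance = $distance;
        }
    }
    return $best;
}

// Sample vocabulary, invented for this example.
$vocabulary = ['right', 'shoulder', 'knee', 'elbow'];
$corrected = closestWord('shlder', $vocabulary);
echo $corrected === null ? 'no match' : $corrected, "\n";
```

The corrected word can then be passed to the normal fulltext query. Because every input is compared against the whole vocabulary, the cost grows with the vocabulary size, which is where the benchmarking comes in.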
