Implementation of Levenshtein distance for mysql/fuzzy search? Implementation of Levenshtein distance for mysql/fuzzy search? database database

Implementation of Levenshtein distance for mysql/fuzzy search?


In order to efficiently search using levenshtein distance, you need an efficient, specialised index, such as a bk-tree. Unfortunately, no database system I know of, including MySQL, implements bk-tree indexes. This is further complicated if you're looking for full-text search, instead of just a single term per row. Off-hand, I can't think of any way that you could do full-text indexing in a manner that allows for searching based on levenshtein distance.


There is a mysql UDF implementation of Levenshtein Distance function

https://github.com/jmcejuela/Levenshtein-MySQL-UDF

It is implemented in C and has better performance than the "MySQL Levenshtein distance query" mentioned by schnaader


An implementation for the damerau-levenshtein distance can be found here:Damerau-Levenshtein algorithm: Levenshtein with transpositionsThe improvement over pure Levenshtein distance is that the swapping of characters is considered. I found it in the comments of schnaader's link, thanks!