Levenshtein Distance Algorithm better than O(n*m)? Levenshtein Distance Algorithm better than O(n*m)? ios ios

Levenshtein Distance Algorithm better than O(n*m)?


Are you interested in reducing the time complexity or the space complexity ? The average time complexity can be reduced O(n + d^2), where n is the length of the longer string and d is the edit distance. If you are only interested in the edit distance and not interested in reconstructing the edit sequence, you only need to keep the last two rows of the matrix in memory, so that will be order(n).

If you can afford to approximate, there are poly-logarithmic approximations.

For the O(n +d^2) algorithm look for Ukkonen's optimization or its enhancement Enhanced Ukkonen. The best approximation that I know of is this one by Andoni, Krauthgamer, Onak


If you only want the threshold function - eg, to test if the distance is under a certain threshold - you can reduce the time and space complexity by only calculating the n values either side of the main diagonal in the array. You can also use Levenshtein Automata to evaluate many words against a single base word in O(n) time - and the construction of the automatons can be done in O(m) time, too.


Look in Wiki - they have some ideas to improve this algorithm to better space complexity:

Wiki-Link: Levenshtein distance

Quoting:

We can adapt the algorithm to use less space, O(m) instead of O(mn), since it only requires that the previous row and current row be stored at any one time.