
SQLite: Efficient substring search in large table


Solution 1: If you can store every character in your database as an individual word, you can use phrase queries to search for a substring.

For example, assume "my_table" contains a single column "person":

person
------
John Doe
Jane Doe

You can change it to:

person
------
J o h n D o e
J a n e D o e

To search for the substring "ohn", use a phrase query:

SELECT * FROM my_table WHERE person MATCH '"o h n"'

Beware that a search for "JohnD" (the phrase query "J o h n D") will also match "John Doe", which may not be desired. To fix this, change the space character in the original string into something else.

For example, you can replace the space character with "$":

person
------
J o h n $ D o e
J a n e $ D o e
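As a rough illustration of the preprocessing described above, here is a minimal Python sketch. The helper names (to_char_tokens, phrase_query, find_substring) are made up for this example, and it assumes "my_table" is already a full-text (FTS) table with a "person" column. Whether "$" is kept as a token of its own depends on the tokenizer the FTS table uses, so that part is worth checking against your setup.

```python
import sqlite3

SPACE_MARKER = "$"  # stand-in for the original space character, as in the example above


def to_char_tokens(text: str) -> str:
    """Separate every character with a space: 'John Doe' -> 'J o h n $ D o e'."""
    return " ".join(SPACE_MARKER if ch == " " else ch for ch in text)


def phrase_query(substring: str) -> str:
    """Build a phrase query such as '"o h n"' for the substring 'ohn'.

    Assumes the substring contains no double quotes.
    """
    return '"' + to_char_tokens(substring) + '"'


def find_substring(conn: sqlite3.Connection, substring: str):
    """Run the phrase query against the (assumed) FTS table my_table."""
    sql = "SELECT * FROM my_table WHERE person MATCH ?"
    return conn.execute(sql, (phrase_query(substring),)).fetchall()
```

The same to_char_tokens helper is applied both when inserting rows and when building the query, so the stored text and the search phrase are split in the same way.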

Solution 2: Following the idea of solution 1, you can treat every character as an individual word by using a custom tokenizer, and then use phrase queries to search for substrings.

The advantage over solution 1 is that you don't have to add spaces to your data, which would unnecessarily increase the size of the database.

The disadvantage is that you have to implement the custom tokenizer. Fortunately, I have one ready for you. The code is in C, so you have to figure out how to integrate it with your Java code.
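The C tokenizer itself is not reproduced here. To make the idea concrete, this is a small Python sketch of the token stream a one-character-per-token tokenizer might produce. The function name char_tokens is hypothetical, and the tuple layout (token, byte start, byte end, position) is an assumption modelled on what SQLite FTS tokenizers report for each token; it is not the code from this answer.

```python
def char_tokens(text: str):
    """Yield (token, byte_start, byte_end, position) for every character.

    Offsets are counted in UTF-8 bytes. Spaces are emitted as tokens too
    (a choice made for this sketch), so the phrase for "JohnD" does not
    match "John Doe".
    """
    offset = 0
    for position, ch in enumerate(text):
        size = len(ch.encode("utf-8"))  # multi-byte characters advance further
        yield ch, offset, offset + size, position
        offset += size


for token in char_tokens("John Doe"):
    print(token)
```

With a tokenizer like this, the stored text stays unchanged and a phrase query over consecutive one-character tokens behaves like a substring search.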


You should add an index to the name column in your database; that should speed up the query considerably.

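I believe SQLite3's full-text search (FTS3) supports wildcard matching like so, although the documentation linked below describes only a trailing "*" for token-prefix queries, so the leading "*" may not give true substring matching: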

SELECT * FROM Elements WHERE name MATCH '*foo*';

http://www.sqlite.org/fts3.html#section_3


I am facing something similar to your problem. Here is my suggestion: try creating a translation table that maps every word to a number, then search for the numbers instead of the words.
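One possible reading of this suggestion, sketched in Python with the standard sqlite3 module. The table names (elements, words, element_words) and the helpers add_element and search are assumptions made up for this illustration. The substring scan runs over the table of distinct words, and the matching rows are then fetched through integer ids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE elements (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE element_words (
        word_id INTEGER, element_id INTEGER,
        PRIMARY KEY (word_id, element_id)
    );
""")


def add_element(name):
    """Store a row and register each of its words in the translation table."""
    element_id = conn.execute(
        "INSERT INTO elements (name) VALUES (?)", (name,)).lastrowid
    for word in set(name.lower().split()):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        (word_id,) = conn.execute(
            "SELECT id FROM words WHERE word = ?", (word,)).fetchone()
        conn.execute("INSERT OR IGNORE INTO element_words VALUES (?, ?)",
                     (word_id, element_id))
    return element_id


def search(substring):
    """Scan the distinct words for the substring, then join back by number.

    Note: '%' and '_' inside the substring act as LIKE wildcards.
    """
    rows = conn.execute("""
        SELECT DISTINCT e.id, e.name
        FROM words w
        JOIN element_words ew ON ew.word_id = w.id
        JOIN elements e ON e.id = ew.element_id
        WHERE w.word LIKE '%' || ? || '%'
        ORDER BY e.id
    """, (substring.lower(),)).fetchall()
    return [name for _, name in rows]
```

This only finds substrings that fall inside a single word; a search that spans a word boundary, such as "n D", would need a different approach.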

Please let me know if this helps.