Performance implication of LIKE query when operating on a subset of full table
@MatBailie already cleared your primary question. I want to address an assertion of yours:
I appreciate that LIKE queries are slow as they cannot be indexed.
This is not entirely true.
Firstly, and this has been true for a long time, left anchored patterns can use an index. This works for regular expressions (~
) as well as LIKE
(~~
) and SIMILAR TO
. I recently wrote a comprehensive review on the matter on dba.SE:
This may not work for you, because the patterns in your question are not anchored. If they were you could get optimized performance with a multicolumn index that uses the text pattern operator class text_pattern_ops
for the message
column like this:
CREATE INDEX tbl_user_id_message_idx ON tbl (user_id, message text_pattern_ops);
For queries like:
SELECT *FROM tblWHERE user_id = 2AND message ~~ 'bar%'; -- left anchored LIKE
Secondly, since PostgreSQL 9.1 you can use the pg_trgm
extension and create a GIST or GIN index with it, that all patterns can use. Some limitations apply. Maintenance of such an index is more expensive, so it is most useful for read-only or rarely written tables. Details:
Depesz has a related tutorial.
The optimiser determines many things when compiling SQL into a plan.
One of them is how to filter data (with index seeks, etc) before applying other conditions on a row by row basis.
In your case, provided you have a suitable index the LIKE will only be applied to the records after that filtering is done.
To understand a bit more about it, get the plan that is created by your query. You should be able to see where indexes are used to sub-set/filter the data, and then a separate step applying the LIKE condition.