PostgreSQL performance difference between LIKE and regex PostgreSQL performance difference between LIKE and regex postgresql postgresql

PostgreSQL performance difference between LIKE and regex


The text LIKE text operator (~~) is implemented by specific C code in like_match.c. It's ad-hoc code that is completely independent from regular expressions. Looking at the comments, it's obviously specially optimized to implement only % and _ as wildcards, and short-circuiting to an exit whenever possible, whereas a regular expression engine is more complex by several orders of magnitude.

Note that in your test case , just like the regexp is suboptimal compared to LIKE, LIKE is probably suboptimal compared to strpos(name, '***') > 0

strpos is implemented with the Boyer–Moore–Horspool algorithm which is optimized for large substrings with few partial matches in the searched text.

Internally these functions are reasonably optimized but when there are several methods to the same goal, choosing the likely best is still the job of the caller. PostgreSQL will not analyze for us the pattern to match and switch a regexp into a LIKE or a LIKE into a strpos based on that analysis.


I am not sure if I should publish it like an answer... I made a rough comparison making something similar in PHP - filtering huge array using regex and simple strpos (as a substitute to LIKE). The code:

// regex filter$filteredRegex = array_filter($a,function($item){    return preg_match('/000/',$item);});// substring search filter$filteredStrpos = array_filter($a,function($item){    return strpos($item,'000')!==FALSE;});

So, benchmarking this code results in that regex filter doubles the result of strpos in time, so I can suppose that CPU cost of regex is roughly double of simple search of substring.

Looks like @zerkms had all the reason :)