Difference between LIKE and ~ in Postgres Difference between LIKE and ~ in Postgres postgresql postgresql

Difference between LIKE and ~ in Postgres


~ is the regular expression operator, and has the capabilities implied by that. You can specify a full range of regular expression wildcards and quantifiers; see the documentation for details. It is certainly more powerful than LIKE, and should be used when that power is needed, but they serve different purposes.


There is nothing wrong with LIKE and, IMO, no reason to favour ~ over it. Rather the opposite. LIKE is SQL-standard. So is SIMILAR TO, but it isn't widely supported. PostgreSQL's ~ operator (or posix regular expression matching operator) isn't SQL standard.

For that reason, I prefer to use LIKE where it's expressive enough and I only use ~ when I need the power of full regular expressions. If I ever need to port databases it's one less thing that'll hurt. I've tended to use SIMILAR TO when LIKE isn't powerful enough, but after Erwin's comments I think I'll stop doing that and use ~ when LIKE doesn't do the job.

Also, PostgreSQL can use a b-tree index for prefix searches (eg LIKE 'TEST%') with LIKE or SIMILAR TO if the database is in the C locale or the index has text_pattern_ops. Contrary to what I wrote earlier, Pg can also use such an index for a left-anchored posix regex, it just needs an explicit '^TEST.*' so the regex can only match from the beginning. My post earlier incorrectly stated that ~ couldn't use an index for a prefix search. With that difference eliminated it's really down to whether you want to stick to standard compliant features where possible or not.

See this demo SQLFiddle; note the different execution plans. Note the difference between ~ '1234.*' and ~ '^1234.*'.

Given sample data:

create table test (   blah text);insert into test (blah)  select x::text from generate_series(1,10000) x;create index test_blah_txtpat_idx ON test(blah text_pattern_ops);

note that ~ uses a seqscan even when it's substantially more expensive (artifically so due to enable_seqscan) because it has no alternative, while LIKE uses the index. However, a corrected ~ with a left anchor uses the index too:

regress=# SET enable_seqscan = 'f';SETregress=# explain select 1 from test where blah ~ '12.*';                                QUERY PLAN                                 --------------------------------------------------------------------------- Seq Scan on test  (cost=10000000000.00..10000000118.69 rows=2122 width=0)   Filter: (blah ~ '12.*'::text)(2 rows)regress=# explain select 1 from test where blah like '12%';                                     QUERY PLAN                                     ------------------------------------------------------------------------------------ Bitmap Heap Scan on test  (cost=4.55..46.76 rows=29 width=0)   Filter: (blah ~~ '12%'::text)   ->  Bitmap Index Scan on test_blah_txtpat_idx  (cost=0.00..4.54 rows=29 width=0)         Index Cond: ((blah ~>=~ '12'::text) AND (blah ~<~ '13'::text))(4 rows)regress=# explain select 1 from test where blah ~ '^12.*';                                     QUERY PLAN                                      ------------------------------------------------------------------------------------- Bitmap Heap Scan on test  (cost=5.28..51.53 rows=101 width=0)   Filter: (blah ~ '^12.*'::text)   ->  Bitmap Index Scan on test_blah_txtpat_idx  (cost=0.00..5.25 rows=100 width=0)         Index Cond: ((blah ~>=~ '12'::text) AND (blah ~<~ '13'::text))(4 rows)


Overview

LIKE, SIMILAR TO and ~ are the basic pattern matching operators in PostgreSQL.

If you can, use LIKE (~~), it's fastest and simplest.
If you can't, use a regular expression (~), it's more powerful.
Never user SIMILAR TO. It's pointless. See below.

Installing the additional module pg_trgm adds advanced index options and the similarity operator %.
And there is also text search with its own infrastructure and the @@ operator (among others).

Index support is available for each of these operators - to a varying degree. It regularly trumps the performance of other options. But there is a lot of leeway in the details, even with indexes.

Index support

Without pg_trgm, there is only index support for left anchored search patterns. If your database cluster runs with a non-C locale (typical case), you need an index with a special operator class for that, like text_pattern_ops or varchar_pattern_ops. Basic left-anchored regular expressions are supported by this, too. Example:

CREATE TABLE tbl(string text);INSERT INTO  tbl(string)SELECT x::text FROM generate_series(1, 10000) x;CREATE INDEX tbl_string_text_pattern_idx ON tbl(string text_pattern_ops);SELECT * FROM tbl WHERE string ~ '^1234';  -- left anchored pattern

db<>fiddle here

With pg_trgm installed, GIN or GiST indexes are possible with the operator classes gist_trgm_ops or gin_trgm_ops. These indexes support any LIKE expression, not just left anchored. And, quoting the manual:

Beginning in PostgreSQL 9.3, these index types also support index searches for regular-expression matches.

Details:


SIMILAR TO is a very odd construct. PostgreSQL only implements it because it was defined in early versions of the SQL standard. Internally, every SIMILAR TO expression is rewritten with a regular expression. Therefore, for any given SIMILAR TO expression, there is at least one regular expression doing the same job faster. I never use SIMILAR TO.

Further reading: