Escape function for regular expression or LIKE patterns
To address the question at the top:
Regular expression escape function
Let's start with a complete list of characters with special meaning in regular expression patterns:
!$()*+.:<=>?[\]^{|}-
Wrapped in a bracket expression most of them lose their special meaning - with a few exceptions:
-
needs to be first or last or it signifies a range of characters.]
and\
have to be escaped with\
(in the replacement, too).
After adding capturing parentheses for the back reference below we get this regexp pattern:
([!$()*+.:<=>?[\\\]^{|}-])
Using it, this function escapes all special characters with a backslash (\
) - thereby removing the special meaning:
CREATE OR REPLACE FUNCTION f_regexp_escape(text) RETURNS text LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS$func$SELECT regexp_replace($1, '([!$()*+.:<=>?[\\\]^{|}-])', '\\\1', 'g')$func$;
Add PARALLEL SAFE
(because it is) in Postgres 10 or later to allow parallelism for queries using it.
Demo
SELECT f_regexp_escape('test(1) > Foo*');
Returns:
test\(1\) \> Foo\*
And while:
SELECT 'test(1) > Foo*' ~ 'test(1) > Foo*';
returns FALSE
, which may come as a surprise to naive users,
SELECT 'test(1) > Foo*' ~ f_regexp_escape('test(1) > Foo*');
Returns TRUE
as it should now.
LIKE
escape function
For completeness, the pendant for LIKE
patterns, where only three characters are special:
\%_
The default escape character is the backslash but a different one can be selected by using the
ESCAPE
clause.
This function assumes the default:
CREATE OR REPLACE FUNCTION f_like_escape(text) RETURNS text LANGUAGE sql IMMUTABLE STRICT PARALLEL SAFE AS$func$SELECT replace(replace(replace($1 , '\', '\\') -- must come 1st , '%', '\%') , '_', '\_');$func$;
We could use the more elegant regexp_replace()
here, too, but for the few characters, a cascade of replace()
functions is faster.
Again, PARALLEL SAFE
in Postgres 10 or later.
Demo
SELECT f_like_escape('20% \ 50% low_prices');
Returns:
20\% \\ 50\% low\_prices
how about trying something like this, substituting var_name
for my hard-coded 'John Bernard'
:
create table my_table(name text primary key);insert into my_table(name) values ('John Bernard'), ('John Bernard (1)'), ('John Bernard (2)'), ('John Bernard (3)');select max(regexp_replace(substring(name, 13), ' |\(|\)', '', 'g')::integer+1) from my_table where substring(name, 1, 12)='John Bernard' and substring(name, 13)~'^ \([1-9][0-9]*\)$'; max----- 4(1 row)
one caveat: I am assuming single-user access to the database while this process is running (and so are you in your approach). If that is not the case then the max(n)+1
approach will not be a good one.
Are you at liberty to change the schema? I think the problem would go away if you could use a composite primary key:
name text not null,number integer not null,primary key (name, number)
It then becomes the duty of the display layer to display Fred #0 as "Fred", Fred #1 as "Fred (1)", &c.
If you like, you can create a view for this duty. Here's the data:
=> select * from foo; name | number --------+-------- Fred | 0 Fred | 1 Barney | 0 Betty | 0 Betty | 1 Betty | 2(6 rows)
The view:
create or replace view foo_view asselect *,case when number = 0 then name else name || ' (' || number || ')'end as name_and_numberfrom foo;
And the result:
=> select * from foo_view; name | number | name_and_number --------+--------+----------------- Fred | 0 | Fred Fred | 1 | Fred (1) Barney | 0 | Barney Betty | 0 | Betty Betty | 1 | Betty (1) Betty | 2 | Betty (2)(6 rows)