Split string at specific character SQL-Standard Split string at specific character SQL-Standard postgresql postgresql

Split string at specific character SQL-Standard


Your second example is a bit confusing because you are mixing 'ABC_AB_A' and 'AB_XXX' not sure if that is typo.

But if you just want all characters before the first _ then the following works in Postgres:

left(col, strpos(col, '_') - 1)

or using a regular expression:

substring(col from '([A-Z]+)(_{1})')

You can use a regular expression in Oracle as well:

regexp_substr(col, '([A-Z]+)(_{1})', 1, 1, 'i', 1)

Postgres' substring function always returns the first capturing group of the regex whereas in Oracle you can specify the group you want: that is the last parameter to the regexp_substr() function.

SQLFiddle for Oracle: http://sqlfiddle.com/#!4/b138c/1
SQLFiddle for Postgres: http://sqlfiddle.com/#!15/4b2bb/1


tl;dr

Use split_part which was purposely built for this:

split_part(string, '_', 1)

Explanation

Quoting this API docs:

SPLIT_PART() function splits a string on a specified delimiter and returns the nth substring.

The 3 parameters are the string to be split, the delimiter, and the part/substring number (starting from 1) to be returned.

So if you have a field named string that contains stuff like AB_XXX and you would like to get everything before _, then you split by that and get the first part/substring: split_part(string, '_', 1).


The standard SQL string functions are described at:SQL String Functions and Operators.

There's a substring function that can extract contents directly, without having to nest function calls. It's detailed in Pattern matching as:

The substring function with three parameters, substring(string from pattern for escape-character), provides extraction of a substring that matches an SQL regular expression pattern. As with SIMILAR TO, the specified pattern must match the entire data string, or else the function fails and returns null. To indicate the part of the pattern that should be returned on success, the pattern must contain two occurrences of the escape character followed by a double quote ("). The text matching the portion of the pattern between these markers is returned.

In your case:

select substring('AB_XX' from '#"%#"#_%' for '#');

Result:

 substring ----------- AB(1 row)

The syntax is a bit weird, especially since _ is a wildcard for a single character so it has to be quoted, but it's the SQL standard.

For the syntax that more people use, consider regexp_replace() or a similar function working with POSIX regular expressions.