How to replace every other instance of a particular character in a MySQL string? How to replace every other instance of a particular character in a MySQL string? sql sql

How to replace every other instance of a particular character in a MySQL string?


You should consider to store your data in a normalized schema. In your case the table should look like:

| id | k |        v ||----|---|----------||  1 | A |       10 ||  1 | B |       20 ||  1 | C |       30 ||  2 | A | Positive ||  2 | B | Negative |

This schema is more flexible and you will see why.

So how to convert the given data into the new schema? You will need a helper table containing sequence numbers. Since your column is varchar(255) you can only store 128 values (+ 127 delimiters) in it. But let's just create 1000 numbers. You can use any table with enough rows. But since any MySQL server has the information_schema.columns table, I will use it.

drop table if exists helper_sequence;create table helper_sequence (i int auto_increment primary key)    select null as i    from information_schema.columns c1    join information_schema.columns c2    limit 1000;

We will use this numbers as position of the values in your string by joining the two tables.

To extract a value from a delimited string you can use the substring_index() function. The value at position i will be

substring_index(substring_index(t.options, '|', i  ), '|', -1)

In your string you have a sequence of keys followed by its values. The position of a key is an odd number. So if the position of the key is i, the position of the corresponding value will be i+1

To get the number of the delimiters in the string and limit our join we can use

char_length(t.options) - char_length(replace(t.options, '|', ''))

The query to store the data in a normalized form would be:

create table normalized_table    select t.id        , substring_index(substring_index(t.options, '|', i  ), '|', -1) as k        , substring_index(substring_index(t.options, '|', i+1), '|', -1) as v    from old_table t    join helper_sequence s      on s.i <= char_length(t.options) - char_length(replace(t.options, '|', ''))    where s.i % 2 = 1

Now run select * from normalized_table and you will get this:

| id | k |        v ||----|---|----------||  1 | A |       10 ||  1 | B |       20 ||  1 | C |       30 ||  2 | A | Positive ||  2 | B | Negative |

So why is this format a better choice? Besides many other reasons, one is that you can easily convert it to your old schema with

select id, group_concat(concat(k, '|', v) order by k separator '|') as optionsfrom normalized_tablegroup by id;| id |               options ||----|-----------------------||  1 |        A|10|B|20|C|30 ||  2 | A|Positive|B|Negative |

or to your desired format

select id, group_concat(concat(k, '|', v) order by k separator ',') as optionsfrom normalized_tablegroup by id;| id |               options ||----|-----------------------||  1 |        A|10,B|20,C|30 ||  2 | A|Positive,B|Negative |

If you don't care about normalization and just want this task to be done, you can update your table with

update old_table ojoin (    select id, group_concat(concat(k, '|', v) order by k separator ',') as options    from normalized_table    group by id) n using (id)set o.options = n.options;

And drop the normalized_table.

But then you won't be able to use simple queries like

select *from normalized_tablewhere k = 'A'

See demo at rextester.com


Not using stored procedures, I would do it in 2 steps:

  1. Insert the comma at the second occurrence of the pipe character:

    update options set options = insert(options, locate('|', options, locate('|', options) + 1), 1, ',');
  2. Insert the remaining commas - execute the query N times:

    update options set options = insert(options, locate('|', options, locate('|', options, length(options) - locate(',', reverse(options)) + 1) + 1), 1, ',');

    where N =

    select max(round(((length(options) - length(replace(options, '|', ''))) - 1 ) / 2) - 1) from options;

    (or don't bother with counting and continue to execute the query as long as it doesn't tell you "0 rows affected")

Checked with this set of data:

id   options1    A|10|B|20|C|302    A|Positive|B|Negative3    A|10|B|20|C|30|D|40|E|50|F|604    A|Positive|B|Negative|C|Neutral|D|Dunno

results in:

id   options1    A|10,B|20,C|302    A|Positive,B|Negative3    A|10,B|20,C|30,D|40,E|50,F|604    A|Positive,B|Negative,C|Neutral,D|Dunno

(I'll provide an explanation later)


Demo

Rextester demo

Explanation

This could be solved relatively easily if only MySQL had a regular expression replacement function but unfortunately it doesn't. So I wrote one - see this blog post. The "advanced version" is needed here to allows it to perform a recursive replace in the found match for the replacement. Then the following relatively simple SQL can be used:

SQL (function code omitted for brevity)

SELECT id,       options AS `before`,       reg_replace(options,                   '\\|.*\\|', -- 2 pipe symbols with any text in between                   '\\|$',     -- Replace the second pipe symbol                   ',',        -- Replace with a comma                   FALSE,      -- Non-greedy matching                   2,          -- Min match length = 2 (2 pipe symbols)                   0,          -- No max match length                   0,          -- Min sub-match length = 1 (1 pipe symbol)                   0           -- Max sub-match length = 1 (1 pipe symbol)                   ) AS `after`FROM tbl;