List all array elements of a MySQL JSON field List all array elements of a MySQL JSON field json json

List all array elements of a MySQL JSON field


Here's the setup:

create table t ( id serial primary key, content json);insert into t set content = '{"tags": ["tag_1", "tag_2"]}';insert into t set content = '{"tags": ["tag_3", "tag_2"]}';insert into t set content = '{"tags": ["tag_1", "tag_2"]}';

If you know the maximum number of tags in any tag array, you can extract all the tags using UNION:

select id, json_extract(content, '$.tags[0]') AS tag from t unionselect id, json_extract(content, '$.tags[1]') from t;+----+---------+| id | tag     |+----+---------+|  1 | "tag_1" ||  2 | "tag_3" ||  3 | "tag_1" ||  1 | "tag_2" ||  2 | "tag_2" ||  3 | "tag_2" |+----+---------+

You need as many unioned subqueries as the number of tags in the longest array.

Then you can put this in a derived table and perform an aggregation on it:

select tag, count(*) as countfrom (     select id, json_extract(content, '$.tags[0]') as tag from t     union     select id, json_extract(content, '$.tags[1]') from t) as t2group by tagorder by count desc;+---------+-------+| tag     | count |+---------+-------+| "tag_2" |     3 || "tag_1" |     2 || "tag_3" |     1 |+---------+-------+

This would be easier if you stored tags in a second table instead of in a JSON array:

create table tags ( id bigint unsigned, tag varchar(20) not null, primary key (id, tag));insert into tags set id = 1, tag = 'tag_1';insert into tags set id = 1, tag = 'tag_2';insert into tags set id = 2, tag = 'tag_3';insert into tags set id = 2, tag = 'tag_2';insert into tags set id = 3, tag = 'tag_1';insert into tags set id = 3, tag = 'tag_2';select tag, count(*) as count from tagsgroup by tagorder by count desc;+-------+-------+| tag   | count |+-------+-------+| tag_2 |     3 || tag_1 |     2 || tag_3 |     1 |+-------+-------+

This solutions works no matter how many tags per id you have. You don't need to know the max length of the list of tags per id.

JSON is nice when you need to store a 'document' of semi-structured data, but only when you treat the document as one irreducible data value. As soon as you need to access elements of the document and apply relational operations to them, the document-oriented approach shows its weakness.