Postgres 9.4 jsonb array as table
Query
Your table definition is missing. Assuming:
CREATE TABLE configuration ( config_id serial PRIMARY KEY, config jsonb NOT NULL);
To find the a value
and its row for given oid
and instance
:
SELECT c.config_id, d->>'value' AS valueFROM configuration c , jsonb_array_elements(config->'data') d -- default col name is "value"WHERE d->>'oid' = '1.3.6.1.4.1.7352.3.10.2.5.35.3'AND d->>'instance' = '0'AND d->>'value' <> '1'
That's an implicit LATERAL
join. Compare:
2) What is the fastest way to get a table with 3 columns of
oid
,instance
andvalue.
I suppose to use jsonb_populate_recordset()
, then you can provide data types in the table definition. Assuming text
for all:
CREATE TEMP TABLE data_pattern (oid text, value text, instance text);
Could also be a persisted (non-temp) table. This one is only for the current session. Then:
SELECT c.config_id, d.*FROM configuration c , jsonb_populate_recordset(NULL::data_pattern, c.config->'data') d
That's all. The first query rewritten:
SELECT c.config_id, d.*FROM configuration c , jsonb_populate_recordset(NULL::data_pattern, c.config->'data') dWHERE d.oid = '1.3.6.1.4.1.7352.3.10.2.5.35.3'AND d.instance = '0'AND d.value <> '1';
But that's slower than the first query. Key to performance with bigger table is index support:
Index
You could easily index the normalized (translated) table or the alternative layout you proposed in the question. Indexing your current layout is not as obvious, but also possible. For best performance I suggest a functional index on just the data
key with the jsonb_path_ops
operator class. Per documentation:
The technical difference between a
jsonb_ops
and ajsonb_path_ops
GIN index is that the former creates independent index items for each key and value in the data, while the latter creates index items only for each value in the data.
This should work wonders for performance:
CREATE INDEX configuration_my_idx ON configurationUSING gin ((config->'data') jsonb_path_ops);
One might expect that only a complete match for a JSON array element would work, like:
SELECT * FROM configurationWHERE (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3" , "instance": "0", "value": "1234"}]';
Note the JSON array notation (with enclosing []
) of the provided value, that's required.
But array elements with a subset of keys work as well:
SELECT * FROM configurationWHERE (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3" , "instance": "0"}]'
The hard part is to incorporate your seemingly unsuspicious added predicate value <> '1'
. Care must be taken to apply all predicates to the same array element. You could combine this with the first query:
SELECT c.*, d->>'value' AS valueFROM configuration c , jsonb_array_elements(config->'data') dWHERE (config->'data') @> '[{"oid": "1.3.6.1.4.1.7352.3.10.2.5.35.3", "instance": "0"}]'AND d->>'oid' = '1.3.6.1.4.1.7352.3.10.2.5.35.3' -- must be repeatedAND d->>'instance' = '0' -- must be repeatedAND d->>'value' <> '1' -- here we can rule out
Voilá.
Special index
If your table is huge, index size may be a deciding factor. You could compare performance of this special solution with a functional index:
This function extracts a Postgres array of oid-instance combinations from a given jsonb
value:
CREATE OR REPLACE FUNCTION f_config_json2arr(_j jsonb) RETURNS text[] LANGUAGE sql IMMUTABLE AS$func$SELECT ARRAY( SELECT (elem->>'oid') || '-' || (elem->>'instance') FROM jsonb_array_elements(_j) elem )$func$
We can build a functional index based on this:
CREATE INDEX configuration_conrfig_special_idx ON configurationUSING gin (f_config_json2arr(config->'data'));
And base the query on it:
SELECT * FROM configurationWHERE f_config_json2arr(config->'data') @> '{1.3.6.1.4.1.7352.3.10.2.5.35.3-0}'::text[]
The idea is that the index should be substantially smaller because it only stores the combined values without keys. The array containment operator @>
itself should perform similar to the jsonb containment operator @>
. I don't expect a big difference, but I would be very interested which is faster.
Similar to the first solution in this related answer (but more specialized):
Asides:
- I would not use
oid
as column name since that's also used for internal purposes in Postgres. - If possible, I would use a plain, normalized table without JSON.