Is there something like a zip() function in PostgreSQL that combines two arrays?
Postgres 9.5 or later
has array_agg(array expression)
:
array_agg
(anyarray
) →anyarray
Concatenates all the input arrays into an array of one higherdimension. (The inputs must all have the same dimensionality, andcannot be empty or null.)
This is a drop-in replacement for my custom aggregate function array_agg_mult()
demonstrated below. It's implemented in C and considerably faster. Use it.
Postgres 9.4
Use the ROWS FROM
construct or the updated unnest()
which takes multiple arrays to unnest in parallel. Each can have a different length. You get (per documentation):
[...] the number of result rows in this case is that of the largest functionresult, with smaller results padded with null values to match.
Use this cleaner and simpler variant:
SELECT ARRAY[a,b] AS abFROM unnest('{a,b,c}'::text[] , '{d,e,f}'::text[]) x(a,b);
Postgres 9.3 or older
Simple zip()
Consider the following demo for Postgres 9.3 or earlier:
SELECT ARRAY[a,b] AS abFROM ( SELECT unnest('{a,b,c}'::text[]) AS a , unnest('{d,e,f}'::text[]) AS b ) x;
Result:
ab------- {a,d} {b,e} {c,f}
Note that both arrays must have the same number of elements to unnest in parallel, or you get a cross join instead.
You can wrap this into a function, if you want to:
CREATE OR REPLACE FUNCTION zip(anyarray, anyarray) RETURNS SETOF anyarray LANGUAGE SQL AS$func$SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;$func$;
Call:
SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);
Same result.
zip() to multi-dimensional array:
Now, if you want to aggregate that new set of arrays into one 2-dimenstional array, it gets more complicated.
SELECT ARRAY (SELECT ...)
or:
SELECT array_agg(ARRAY[a,b]) AS abFROM ( SELECT unnest('{a,b,c}'::text[]) AS a ,unnest('{d,e,f}'::text[]) AS b ) x
or:
SELECT array_agg(ARRAY[ARRAY[a,b]]) AS abFROM ...
will all result in the same error message (tested with pg 9.1.5):
ERROR: could not find array type for data type text[]
But there is a way around this, as we worked out under this closely related question.
Create a custom aggregate function:
CREATE AGGREGATE array_agg_mult (anyarray) ( SFUNC = array_cat , STYPE = anyarray , INITCOND = '{}');
And use it like this:
SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS abFROM ( SELECT unnest('{a,b,c}'::text[]) AS a , unnest('{d,e,f}'::text[]) AS b ) x
Result:
{{a,d},{b,e},{c,f}}
Note the additional ARRAY[]
layer! Without it and just:
SELECT array_agg_mult(ARRAY[a,b]) AS abFROM ...
You get:
{a,d,b,e,c,f}
Which may be useful for other purposes.
Roll another function:
CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray) RETURNS SETOF anyarray LANGUAGE SQL AS$func$SELECT array_agg_mult(ARRAY[ARRAY[a,b]])FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;$func$;
Call:
SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type
Result:
{{a,d},{b,e},{c,f}}
Here's another approach that's safe for arrays of differing lengths, using the array multi-aggregation mentioned by Erwin:
CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[]AS $$SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])FROM generate_subscripts( CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END, 1) AS subscripts(i)$$ LANGUAGE sql;regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]); zip --------------------- {{a,d},{b,e},{c,f}}(1 row)regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]); zip ------------------------------ {{a,d},{b,e},{c,f},{NULL,g}}(1 row)regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]); zip ------------------------------ {{a,d},{b,e},{c,f},{z,NULL}}(1 row)
If you want to chop off the excess rather than null-padding, just change the >=
length test to <=
instead.
This function does not handle the rather bizarre PostgreSQL feature that arrays may have a stating element other than 1, but in practice nobody actually uses that feature. Eg with a zero-indexed 3-element array:
regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0])); zip ------------------------ {{a,z},{b,z},{c,NULL}}(1 row)
wheras Erwin's code does work with such arrays, and even with multi-dimensional arrays (by flattening them) but does not work with arrays of differing length.
Arrays are a bit special in PostgreSQL, they're a little too flexible with multi-dimensional arrays, configurable origin index, etc.
In 9.4 you'll be able to write:
SELECT array_agg_mult(ARRAY[ARRAY[a,b])FROM unnest(array1) WITH ORDINALITY as (o,a)NATURAL FULL OUTER JOINunnest(array2) WITH ORDINALITY as (o,b);
which will be a lot nicer, especially if an optimisation to scan the functions together rather than doing a sort and join goes in.