Is there something like a zip() function in PostgreSQL that combines two arrays? Is there something like a zip() function in PostgreSQL that combines two arrays? arrays arrays

Is there something like a zip() function in PostgreSQL that combines two arrays?


Postgres 9.5 or later

has array_agg(array expression):

array_agg ( anyarray ) → anyarray

Concatenates all the input arrays into an array of one higherdimension. (The inputs must all have the same dimensionality, andcannot be empty or null.)

This is a drop-in replacement for my custom aggregate function array_agg_mult() demonstrated below. It's implemented in C and considerably faster. Use it.

Postgres 9.4

Use the ROWS FROM construct or the updated unnest() which takes multiple arrays to unnest in parallel. Each can have a different length. You get (per documentation):

[...] the number of result rows in this case is that of the largest functionresult, with smaller results padded with null values to match.

Use this cleaner and simpler variant:

SELECT ARRAY[a,b] AS abFROM   unnest('{a,b,c}'::text[]             , '{d,e,f}'::text[]) x(a,b);

Postgres 9.3 or older

Simple zip()

Consider the following demo for Postgres 9.3 or earlier:

SELECT ARRAY[a,b] AS abFROM  (   SELECT unnest('{a,b,c}'::text[]) AS a        , unnest('{d,e,f}'::text[]) AS b    ) x;

Result:

  ab------- {a,d} {b,e} {c,f}

Note that both arrays must have the same number of elements to unnest in parallel, or you get a cross join instead.

You can wrap this into a function, if you want to:

CREATE OR REPLACE FUNCTION zip(anyarray, anyarray)  RETURNS SETOF anyarray LANGUAGE SQL AS$func$SELECT ARRAY[a,b] FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;$func$;

Call:

SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);

Same result.

zip() to multi-dimensional array:

Now, if you want to aggregate that new set of arrays into one 2-dimenstional array, it gets more complicated.

SELECT ARRAY (SELECT ...)

or:

SELECT array_agg(ARRAY[a,b]) AS abFROM  (   SELECT unnest('{a,b,c}'::text[]) AS a         ,unnest('{d,e,f}'::text[]) AS b    ) x

or:

SELECT array_agg(ARRAY[ARRAY[a,b]]) AS abFROM  ...

will all result in the same error message (tested with pg 9.1.5):

ERROR: could not find array type for data type text[]

But there is a way around this, as we worked out under this closely related question.
Create a custom aggregate function:

CREATE AGGREGATE array_agg_mult (anyarray) (   SFUNC    = array_cat , STYPE    = anyarray , INITCOND = '{}');

And use it like this:

SELECT array_agg_mult(ARRAY[ARRAY[a,b]]) AS abFROM  (   SELECT unnest('{a,b,c}'::text[]) AS a        , unnest('{d,e,f}'::text[]) AS b    ) x

Result:

{{a,d},{b,e},{c,f}}

Note the additional ARRAY[] layer! Without it and just:

SELECT array_agg_mult(ARRAY[a,b]) AS abFROM ...

You get:

{a,d,b,e,c,f}

Which may be useful for other purposes.

Roll another function:

CREATE OR REPLACE FUNCTION zip2(anyarray, anyarray)  RETURNS SETOF anyarray LANGUAGE SQL AS$func$SELECT array_agg_mult(ARRAY[ARRAY[a,b]])FROM (SELECT unnest($1) AS a, unnest($2) AS b) x;$func$;

Call:

SELECT zip2('{a,b,c}'::text[],'{d,e,f}'::text[]); -- or any other array type

Result:

{{a,d},{b,e},{c,f}}


Here's another approach that's safe for arrays of differing lengths, using the array multi-aggregation mentioned by Erwin:

CREATE OR REPLACE FUNCTION zip(array1 anyarray, array2 anyarray) RETURNS text[]AS $$SELECT array_agg_mult(ARRAY[ARRAY[array1[i],array2[i]]])FROM generate_subscripts(  CASE WHEN array_length(array1,1) >= array_length(array2,1) THEN array1 ELSE array2 END,  1) AS subscripts(i)$$ LANGUAGE sql;regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f}'::text[]);         zip         --------------------- {{a,d},{b,e},{c,f}}(1 row)regress=> SELECT zip('{a,b,c}'::text[],'{d,e,f,g}'::text[]);             zip              ------------------------------ {{a,d},{b,e},{c,f},{NULL,g}}(1 row)regress=> SELECT zip('{a,b,c,z}'::text[],'{d,e,f}'::text[]);             zip              ------------------------------ {{a,d},{b,e},{c,f},{z,NULL}}(1 row)

If you want to chop off the excess rather than null-padding, just change the >= length test to <= instead.

This function does not handle the rather bizarre PostgreSQL feature that arrays may have a stating element other than 1, but in practice nobody actually uses that feature. Eg with a zero-indexed 3-element array:

regress=> SELECT zip('{a,b,c}'::text[], array_fill('z'::text, ARRAY[3], ARRAY[0]));          zip           ------------------------ {{a,z},{b,z},{c,NULL}}(1 row)

wheras Erwin's code does work with such arrays, and even with multi-dimensional arrays (by flattening them) but does not work with arrays of differing length.

Arrays are a bit special in PostgreSQL, they're a little too flexible with multi-dimensional arrays, configurable origin index, etc.

In 9.4 you'll be able to write:

SELECT array_agg_mult(ARRAY[ARRAY[a,b])FROM unnest(array1) WITH ORDINALITY as (o,a)NATURAL FULL OUTER JOINunnest(array2) WITH ORDINALITY as (o,b);

which will be a lot nicer, especially if an optimisation to scan the functions together rather than doing a sort and join goes in.