PostgreSQL PL/pgSQL random value from array of values PostgreSQL PL/pgSQL random value from array of values arrays arrays

PostgreSQL PL/pgSQL random value from array of values


Try this one:

select (array['Yes', 'No', 'Maybe'])[floor(random() * 3 + 1)];


CREATE FUNCTION random_pick()  RETURNS int AS$func$DECLARE   a int[] := '{[0:2]=1,2,5}'; -- sample sakeBEGIN   RETURN a[floor((random()*3))::int];END$func$ LANGUAGE plpgsql VOLATILE

random() returns a value x where 0.0 <= x < 1.0. Multiply by three and floor() it to get 0, 1 or 2 with equal chance. This would be off-by-one for the default array index that starts with 1. For efficiency, I declare the array index to start with 0 instead.

The manual on these mathematical functions.


Erwin Brandstetter answered the OP's question well enough. However, for others looking for understanding how to randomly pick elements from more complex arrays (like me some two months ago), I expanded his function:

CREATE OR REPLACE FUNCTION random_pick( a anyarray, OUT x anyelement )  RETURNS anyelement AS$func$BEGIN  IF a = '{}' THEN    x := NULL::TEXT;  ELSE    WHILE x IS NULL LOOP      x := a[floor(array_lower(a, 1) + (random()*( array_upper(a, 1) -  array_lower(a, 1)+1) ) )::int];    END LOOP;  END IF;END$func$ LANGUAGE plpgsql VOLATILE RETURNS NULL ON NULL INPUT;

Few assumptions:

  • this is not only for integer arrays, but for arrays of any type

  • we ignore NULL data; NULL is returned only if the array is empty or if NULL is inserted (values of other non-array types produce an error)

  • the array don't need to be formatted as usual - the array index may start and end anywhere, may have gaps etc.

  • this is for one-dimensional arrays

Other notes:

  • without the first IF statement, empty array would lead to an endless loop

  • without the loop, gaps and NULLs would make the function return NULL

  • omit both array_lower calls if you know that your arrays start at zero

  • with gaps in the index, you will need array_upper instead of array_length; without gaps, it's the same (not sure which is faster, but they shouldn't be much different)

  • the +1 after second array_lower serves to get the last value in the array with the same probability as any other; otherwise it would need the random()'s output to be exactly 1, which never happens

  • this is considerably slower than Erwin's solution, and likely to be an overkill for the your needs; in practice, most people would mix an ideal cocktail from the two