Choose list variable given probability of each variable


You can easily achieve this with numpy. Its np.random.choice function accepts a p parameter that gives the probability of each entry.

import numpy as np

# draw 5 names (with replacement) using the given probabilities
np.random.choice(['pooh', 'rabbit', 'piglet', 'Christopher'], 5, p=[0.5, 0.1, 0.1, 0.3])
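For comparison, here is a minimal standard-library sketch of the same draw. It assumes Python 3.6 or later, where random.choices is available; the variable names are only illustrative.

```python
import random

names = ['pooh', 'rabbit', 'piglet', 'Christopher']
weights = [0.5, 0.1, 0.1, 0.3]  # same probabilities as the numpy call

# k=5 draws five items with replacement
picks = random.choices(names, weights=weights, k=5)
print(picks)
```

The weights do not have to sum to 1 here, since random.choices treats them as relative weights.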


Basically, build a cumulative distribution function (CDF) array: the value of the CDF at a given index is the sum of all values in P at indices less than or equal to that index. Then generate a random number between 0 and 1 and do a binary search (or a linear search if you want) to find the index it falls under. Here's some simple code for it.

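from bisect import bisect
from random import random

P = [0.10, 0.25, 0.60, 0.05]

# cdf[i] = P[0] + ... + P[i]
cdf = [P[0]]
for i in range(1, len(P)):
    cdf.append(cdf[-1] + P[i])

# index of the first cdf entry greater than a uniform draw from [0, 1)
random_ind = bisect(cdf, random())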

Of course, you can generate a bunch of random indices with something like

rs = [bisect(cdf, random()) for i in range(20)]

yielding

[2, 2, 3, 2, 2, 1, 2, 2, 2, 1, 2, 1, 2, 1, 2, 1, 2, 2, 2, 2]

(results will, and should, vary). Binary search is rather unnecessary for so few possible indices, but it is definitely recommended for distributions with many more.
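If you want the item itself rather than its index, the CDF idea can be wrapped in a small helper. This is only a sketch: the name weighted_pick is made up for illustration, itertools.accumulate stands in for the explicit loop, and the scaling and clamping are an assumption added to cope with floating-point rounding in the last CDF entry.

```python
from bisect import bisect
from itertools import accumulate
from random import random


def weighted_pick(items, probs):
    """Return one element of items, chosen with the given probabilities."""
    cdf = list(accumulate(probs))           # running totals of probs
    # Scale the draw by the final total so a total slightly off 1.0 still works.
    idx = bisect(cdf, random() * cdf[-1])
    # Clamp in the rare case rounding pushes idx one past the end.
    return items[min(idx, len(items) - 1)]


print(weighted_pick(['a', 'b', 'c', 'd'], [0.10, 0.25, 0.60, 0.05]))
```

Building the CDF on every call costs O(n), so for many draws from the same distribution it would make more sense to compute cdf once and reuse it.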


Hmm interesting, how about...

  1. Generate a random number between 0 and 1.

  2. Walk the list, subtracting the probability of each item from your number.

  3. Pick the item that, after subtraction, took your number down to 0 or below.

That's simple, O(n) and should work :)
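The steps above can be sketched roughly like this. The function name pick_by_subtraction is hypothetical, and two details are my own assumptions: the check is a strict "below 0" so that an item with probability 0 is not picked when the draw is exactly 0, and the last item is returned as a fallback if rounding leaves the number non-negative after the loop.

```python
from random import random


def pick_by_subtraction(items, probs):
    """Walk the list, subtracting each probability from a uniform draw."""
    r = random()                       # uniform in [0, 1)
    for item, p in zip(items, probs):
        r -= p
        if r < 0:                      # the draw fell inside this item's slice
            return item
    return items[-1]                   # fallback for floating-point rounding


print(pick_by_subtraction(['pooh', 'rabbit', 'piglet', 'Christopher'],
                          [0.5, 0.1, 0.1, 0.3]))
```

Putting the most likely items first lets the loop exit earlier on average, though the worst case stays O(n).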