Generating Discrete random variables with specified weights using SciPy or NumPy

python random numpy scipy

Drawing from a discrete distribution is directly built into numpy.The function is called random.choice (difficult to find without any reference to discrete distributions in the numpy docs).

elements = [1.1, 2.2, 3.3]probabilities = [0.2, 0.5, 0.3]np.random.choice(elements, 10, p=probabilities)

python random numpy scipy

Here is a short, relatively simple function that returns weighted values, it uses NumPy's digitize, accumulate, and random_sample.

import numpy as npfrom numpy.random import random_sampledef weighted_values(values, probabilities, size):    bins = np.add.accumulate(probabilities)    return values[np.digitize(random_sample(size), bins)]values = np.array([1.1, 2.2, 3.3])probabilities = np.array([0.2, 0.5, 0.3])print weighted_values(values, probabilities, 10)#Sample output:[ 2.2  2.2  1.1  2.2  2.2  3.3  3.3  2.2  3.3  3.3]

It works like this:

First using accumulate we create bins.
Then we create a bunch of random numbers (between 0, and 1) using random_sample
We use digitize to see which bins these numbers fall into.
And return the corresponding values.

python random numpy scipy

You were going in a good direction: the built-in scipy.stats.rv_discrete() quite directly creates a discrete random variable. Here is how it works:

>>> from scipy.stats import rv_discrete  >>> values = numpy.array([1.1, 2.2, 3.3])>>> probabilities = [0.2, 0.5, 0.3]>>> distrib = rv_discrete(values=(range(len(values)), probabilities))  # This defines a Scipy probability distribution>>> distrib.rvs(size=10)  # 10 samples from range(len(values))array([1, 2, 0, 2, 2, 0, 2, 1, 0, 2])>>> values[_]  # Conversion to specific discrete values (the fact that values is a NumPy array is used for the indexing)[2.2, 3.3, 1.1, 3.3, 3.3, 1.1, 3.3, 2.2, 1.1, 3.3]

The distribution distrib above thus returns indexes from the values list.

More generally, rv_discrete() takes a sequence of integer values in the first elements of its values=(…,…) argument, and returns these values, in this case; there is no need to convert to specific (float) values. Here is an example:

>>> values = [10, 20, 30]>>> probabilities = [0.2, 0.5, 0.3]>>> distrib = rv_discrete(values=(values, probabilities))>>> distrib.rvs(size=10)array([20, 20, 20, 20, 20, 20, 20, 30, 20, 20])

where (integer) input values are directly returned with the desired probability.

CodeHunter

Generating Discrete random variables with specified weights using SciPy or NumPy

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last