Is there a fast alternative to scipy _norm_pdf for correlated distribution sampling?

python numpy scipy distribution montecarlo

If you need just the normal ppf, it is indeed puzzling that it is so slow, but you can use scipy.special.erfinv instead:

x = np.random.uniform(0,1,100)np.allclose(special.erfinv(2*x-1)*np.sqrt(2),stats.norm().ppf(x))# Truetimeit(lambda:stats.norm().ppf(x),number=1000)# 0.7717257660115138timeit(lambda:special.erfinv(2*x-1)*np.sqrt(2),number=1000)# 0.015020604943856597

EDIT:

lognormal and triangle are also straight forward:

c = np.random.uniform()np.allclose(np.exp(c*special.erfinv(2*x-1)*np.sqrt(2)),stats.lognorm(c).ppf(x))# Truenp.allclose(((1-np.sqrt(1-(x-c)/((x>c)-c)))*((x>c)-c))+c,stats.triang(c).ppf(x))# True

skew normal I'm not familiar enough, unfortunately.

python numpy scipy distribution montecarlo

Ultimately, this issue was caused by my use of the skew-normal distribution. The ppf of the skew-normal actually does not have a closed-form analytic definition, so in order to compute the ppf, it fell back to scipy.continuous_rv's numeric approximation, which involved iteratively computing the cdf and using that to zero in on the ppf value. The skew-normal pdf is the product of the normal pdf and normal cdf, so this numeric approximation called the normal's pdf and cdf many many times. This is why when I profiled the code, it looked like the normal distribution was the problem, not the SKU normal. The other answer to this question was able to achieve time savings by skipping type-checking, but didn't actually make a difference on the run-time growth, just a difference on small-n runtimes.

To solve this problem, I have replaced the skew-normal distribution with the Johnson SU distribution. It has 2 more free parameters than a normal distribution so it can fit different types of skew and kurtosis effectively. It's supported for all real numbers, and it has a closed-form ppf definition with a fast implementation in SciPy. Below you can see example Johnson SU distributions I've been fitting from the 10th, 50th, and 90th percentiles.

CodeHunter

Is there a fast alternative to scipy _norm_pdf for correlated distribution sampling?

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last