Is there a fast alternative to scipy _norm_pdf for correlated distribution sampling? Is there a fast alternative to scipy _norm_pdf for correlated distribution sampling? numpy numpy

Is there a fast alternative to scipy _norm_pdf for correlated distribution sampling?


If you need just the normal ppf, it is indeed puzzling that it is so slow, but you can use scipy.special.erfinv instead:

x = np.random.uniform(0,1,100)np.allclose(special.erfinv(2*x-1)*np.sqrt(2),stats.norm().ppf(x))# Truetimeit(lambda:stats.norm().ppf(x),number=1000)# 0.7717257660115138timeit(lambda:special.erfinv(2*x-1)*np.sqrt(2),number=1000)# 0.015020604943856597

EDIT:

lognormal and triangle are also straight forward:

c = np.random.uniform()np.allclose(np.exp(c*special.erfinv(2*x-1)*np.sqrt(2)),stats.lognorm(c).ppf(x))# Truenp.allclose(((1-np.sqrt(1-(x-c)/((x>c)-c)))*((x>c)-c))+c,stats.triang(c).ppf(x))# True

skew normal I'm not familiar enough, unfortunately.


Ultimately, this issue was caused by my use of the skew-normal distribution. The ppf of the skew-normal actually does not have a closed-form analytic definition, so in order to compute the ppf, it fell back to scipy.continuous_rv's numeric approximation, which involved iteratively computing the cdf and using that to zero in on the ppf value. The skew-normal pdf is the product of the normal pdf and normal cdf, so this numeric approximation called the normal's pdf and cdf many many times. This is why when I profiled the code, it looked like the normal distribution was the problem, not the SKU normal. The other answer to this question was able to achieve time savings by skipping type-checking, but didn't actually make a difference on the run-time growth, just a difference on small-n runtimes.

To solve this problem, I have replaced the skew-normal distribution with the Johnson SU distribution. It has 2 more free parameters than a normal distribution so it can fit different types of skew and kurtosis effectively. It's supported for all real numbers, and it has a closed-form ppf definition with a fast implementation in SciPy. Below you can see example Johnson SU distributions I've been fitting from the 10th, 50th, and 90th percentiles.

Example distributions