
Are these functions equivalent?


I would say they are, since their sampling is defined in almost exactly the same way in both cases. This is how sampling for tf.distributions.StudentT is defined:

def _sample_n(self, n, seed=None):
    # The sampling method comes from the fact that if:
    #   X ~ Normal(0, 1)
    #   Z ~ Chi2(df)
    #   Y = X / sqrt(Z / df)
    # then:
    #   Y ~ StudentT(df).
    seed = seed_stream.SeedStream(seed, "student_t")
    shape = tf.concat([[n], self.batch_shape_tensor()], 0)
    normal_sample = tf.random.normal(shape, dtype=self.dtype, seed=seed())
    df = self.df * tf.ones(self.batch_shape_tensor(), dtype=self.dtype)
    gamma_sample = tf.random.gamma([n],
                                   0.5 * df,
                                   beta=0.5,
                                   dtype=self.dtype,
                                   seed=seed())
    samples = normal_sample * tf.math.rsqrt(gamma_sample / df)
    return samples * self.scale + self.loc  # Abs(scale) not wanted.

So it is a standard normal sample divided by the square root of a chi-square sample with df degrees of freedom, itself divided by df. The chi-square sample is drawn as a gamma sample with shape 0.5 * df and rate 0.5, which is equivalent (the chi-square distribution is a special case of the gamma distribution). The scale value, like loc, only comes into play in the last line, as a way to relocate and rescale the sampled distribution; when scale is one and loc is zero, they do nothing.
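As a quick sanity check of that gamma/chi-square identity, here is a minimal sketch using NumPy's Generator API (not part of either implementation; a rate of 0.5 corresponds to a scale of 2):

import numpy as np

rng = np.random.default_rng(0)
df = 3.0

# Chi2(df) is Gamma(shape=df/2, rate=1/2), i.e. scale=2.
chi2 = rng.chisquare(df, size=100_000)
gamma = rng.gamma(shape=0.5 * df, scale=2.0, size=100_000)

print(chi2.mean(), gamma.mean())  # both approximately df
print(chi2.var(), gamma.var())    # both approximately 2 * df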

Here is the implementation for np.random.standard_t:

double legacy_standard_t(aug_bitgen_t *aug_state, double df) {
  double num, denom;

  num = legacy_gauss(aug_state);
  denom = legacy_standard_gamma(aug_state, df / 2);
  return sqrt(df / 2) * num / sqrt(denom);
}

So it is essentially the same thing, slightly rephrased. Here we also have a gamma sample with shape df / 2, but it is standard (rate one). A gamma sample with rate 0.5 is just twice a standard gamma sample with the same shape, so the missing rate reappears in the numerator as the / 2 inside the sqrt: num / sqrt(2 * denom / df) equals sqrt(df / 2) * num / sqrt(denom). It is just moving numbers around. There is no scale or loc here, though.
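That algebraic equivalence is easy to verify numerically. A sketch in NumPy (the variable names follow the C code above, but the script itself is mine, not from either code base):

import numpy as np

rng = np.random.default_rng(0)
df, n = 3.0, 100_000

num = rng.standard_normal(n)
denom = rng.gamma(shape=df / 2, size=n)  # standard gamma, rate 1

# NumPy's formulation:
t_numpy_style = np.sqrt(df / 2) * num / np.sqrt(denom)
# TensorFlow's formulation, using Gamma(shape, rate=0.5) == 2 * Gamma(shape, rate=1):
t_tf_style = num / np.sqrt((2 * denom) / df)

print(np.allclose(t_numpy_style, t_tf_style))  # True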

In truth, the difference is that the TensorFlow distribution is a location-scale transformation of the t-distribution, i.e. a shifted and scaled t (sometimes called the nonstandardized Student's t; not to be confused with the noncentral t-distribution, where the shift is applied to the normal sample before the division). A simple empirical check that they agree for loc=0.0 and scale=1.0 is to plot histograms of both distributions and see how close they look.

import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

np.random.seed(0)
t_np = np.random.standard_t(df=3, size=10000)
with tf.Graph().as_default(), tf.Session() as sess:
    tf.random.set_random_seed(0)
    t_dist = tf.distributions.StudentT(df=3.0, loc=0.0, scale=1.0)
    t_tf = sess.run(t_dist.sample(10000))
plt.hist((t_np, t_tf), np.linspace(-10, 10, 20), label=['NumPy', 'TensorFlow'])
plt.legend()
plt.tight_layout()
plt.show()

Output:

[Figure: overlaid histograms of the NumPy and TensorFlow samples]

That looks pretty close. Obviously, from the point of view of statistical samples, this is not any kind of proof. If you are still not convinced, there are statistical tools for testing whether a sample comes from a certain distribution, or whether two samples come from the same distribution (for example, the two-sample Kolmogorov–Smirnov test).
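For instance, a two-sample Kolmogorov–Smirnov test via SciPy, reusing t_np and t_tf from the snippet above (SciPy is an extra dependency here, not part of the original comparison):

from scipy import stats

# Null hypothesis: t_np and t_tf are drawn from the same distribution.
stat, p_value = stats.ks_2samp(t_np, t_tf)
print(stat, p_value)  # a large p-value means the test found no evidence
                      # against the two samples sharing a distribution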