Optimal way to compute pairwise mutual information using numpy

python performance numpy scipy information-theory

I can't suggest a faster calculation for the outer loop over the n*(n-1)/2 pairs of vectors, but your implementation of calc_MI(x, y, bins) can be simplified if you can use scipy version 0.13 or later, or scikit-learn.

In scipy 0.13, the lambda_ argument was added to scipy.stats.chi2_contingency. This argument controls the statistic that is computed by the function. If you use lambda_="log-likelihood" (or lambda_=0), the log-likelihood ratio is returned. This is also often called the G or G² statistic. Other than a factor of 2*n (where n is the total number of samples in the contingency table), this is the mutual information; that is, G = 2*n*MI. So you could implement calc_MI as:

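    import numpy as np
    from scipy.stats import chi2_contingency

    def calc_MI(x, y, bins):
        # The joint histogram of x and y serves as the contingency table.
        c_xy = np.histogram2d(x, y, bins)[0]
        # G statistic (log-likelihood ratio). correction=False keeps Yates'
        # correction from altering G when the table is 2x2.
        g, p, dof, expected = chi2_contingency(c_xy, lambda_="log-likelihood",
                                               correction=False)
        # G = 2 * n * MI, so divide by 2 * n to get MI in nats.
        mi = 0.5 * g / c_xy.sum()
        return mi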
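As a quick sanity check of the factor of 2*n, here is a minimal sketch that computes the mutual information directly from the joint histogram and compares it with the G-based value. The helper name mi_direct, the synthetic data, and bins=5 are my own illustrative choices, not part of the question. correction=False is passed so that Yates' correction can never change G (it would only matter for a 2x2 table).

```python
import numpy as np
from scipy.stats import chi2_contingency

def mi_direct(c_xy):
    """Mutual information in nats, straight from the definition (hypothetical helper)."""
    p_xy = c_xy / c_xy.sum()                 # joint probabilities
    p_x = p_xy.sum(axis=1, keepdims=True)    # marginal of x, shape (bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)    # marginal of y, shape (1, bins)
    nz = p_xy > 0                            # skip empty cells, treating 0 * log(0) as 0
    return float(np.sum(p_xy[nz] * np.log(p_xy[nz] / (p_x * p_y)[nz])))

# Synthetic, correlated data (an assumption made only for this illustration).
rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x + rng.normal(size=1000)

c_xy = np.histogram2d(x, y, bins=5)[0]
g, p, dof, expected = chi2_contingency(c_xy, lambda_="log-likelihood",
                                       correction=False)
mi_from_g = 0.5 * g / c_xy.sum()

# The two values should agree up to floating-point rounding.
print(mi_direct(c_xy), mi_from_g)
```

One caveat: chi2_contingency raises a ValueError when a whole row or column of the table is zero, so very sparse histograms may need fewer bins.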

The only difference between this and your implementation is that this implementation uses the natural logarithm instead of the base-2 logarithm (so it is expressing the information in "nats" instead of "bits"). If you really prefer bits, just divide mi by the natural logarithm of 2, i.e. np.log(2).
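To illustrate the conversion, here is a small sketch using a hand-made 2x2 table (my own example, not from the question) in which x and y are two perfectly dependent fair coin flips. The mutual information should then be log(2) nats, which is 1 bit. Because the table is 2x2, correction=False is needed to keep chi2_contingency from applying Yates' correction.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical contingency table: x and y always agree, each value seen 50 times.
c_xy = np.array([[50, 0],
                 [0, 50]])

g, p, dof, expected = chi2_contingency(c_xy, lambda_="log-likelihood",
                                       correction=False)
mi_nats = 0.5 * g / c_xy.sum()   # mutual information in nats
mi_bits = mi_nats / np.log(2)    # the same quantity expressed in bits

print(mi_nats, mi_bits)
```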

If you have (or can install) sklearn (i.e. scikit-learn), you can use sklearn.metrics.mutual_info_score, and implement calc_MI as:

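    import numpy as np
    from sklearn.metrics import mutual_info_score

    def calc_MI(x, y, bins):
        # The joint histogram of x and y serves as the contingency table.
        c_xy = np.histogram2d(x, y, bins)[0]
        # With contingency given, the two label arguments are not used.
        mi = mutual_info_score(None, None, contingency=c_xy)
        return mi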
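For completeness, either version of calc_MI can be dropped into the outer loop over all pairs of columns. The following is only a sketch of that loop under my own assumptions: a data matrix A of shape (n_samples, n_variables), a hypothetical helper pairwise_MI, and synthetic data. It does not make the loop any faster; it just computes each pair once and fills a symmetric matrix.

```python
from itertools import combinations

import numpy as np
from sklearn.metrics import mutual_info_score

def calc_MI(x, y, bins):
    # Mutual information (in nats) from the joint histogram of x and y.
    c_xy = np.histogram2d(x, y, bins)[0]
    return mutual_info_score(None, None, contingency=c_xy)

def pairwise_MI(A, bins):
    """Symmetric matrix of MI values between columns of A (hypothetical helper)."""
    n_vars = A.shape[1]
    mi = np.zeros((n_vars, n_vars))   # the diagonal is left at zero in this sketch
    for i, j in combinations(range(n_vars), 2):
        mi[i, j] = mi[j, i] = calc_MI(A[:, i], A[:, j], bins)
    return mi

# Synthetic data: column 1 is a noisy copy of column 0, column 2 is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
A = np.column_stack([a, a + 0.1 * rng.normal(size=2000), rng.normal(size=2000)])

M = pairwise_MI(A, bins=10)
print(M)
```

In this toy data set, the entry for columns 0 and 1 should come out much larger than the entry for columns 0 and 2.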
