Plotting log-binned network degree distributions Plotting log-binned network degree distributions numpy numpy

Plotting log-binned network degree distributions


Use log binning (see also). Here is code to take a Counter object representing a histogram of degree values and log-bin the distribution to produce a sparser and smoother distribution.

import numpy as npdef drop_zeros(a_list):    return [i for i in a_list if i>0]def log_binning(counter_dict,bin_count=35):    max_x = log10(max(counter_dict.keys()))    max_y = log10(max(counter_dict.values()))    max_base = max([max_x,max_y])    min_x = log10(min(drop_zeros(counter_dict.keys())))    bins = np.logspace(min_x,max_base,num=bin_count)    # Based off of: http://stackoverflow.com/questions/6163334/binning-data-in-python-with-scipy-numpy    bin_means_y = (np.histogram(counter_dict.keys(),bins,weights=counter_dict.values())[0] / np.histogram(counter_dict.keys(),bins)[0])    bin_means_x = (np.histogram(counter_dict.keys(),bins,weights=counter_dict.keys())[0] / np.histogram(counter_dict.keys(),bins)[0])    return bin_means_x,bin_means_y

Generating a classic scale-free network in NetworkX and then plotting this:

import networkx as nxba_g = nx.barabasi_albert_graph(10000,2)ba_c = nx.degree_centrality(ba_g)# To convert normalized degrees to raw degrees#ba_c = {k:int(v*(len(ba_g)-1)) for k,v in ba_c.iteritems()}ba_c2 = dict(Counter(ba_c.values()))ba_x,ba_y = log_binning(ba_c2,50)plt.xscale('log')plt.yscale('log')plt.scatter(ba_x,ba_y,c='r',marker='s',s=50)plt.scatter(ba_c2.keys(),ba_c2.values(),c='b',marker='x')plt.xlim((1e-4,1e-1))plt.ylim((.9,1e4))plt.xlabel('Connections (normalized)')plt.ylabel('Frequency')plt.show()

Produces the following plot showing the overlap between the "raw" distribution in blue and the "binned" distribution in red.

Comparison between raw and log-binned

Thoughts on how to improve this approach or feedback if I've missed something obvious are welcome.