Is there a function to make scatterplot matrices in matplotlib?


For those who do not want to define their own functions, there is a great data analysis library in Python called pandas, which provides the scatter_matrix() function in its pandas.plotting module:

import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

df = pd.DataFrame(np.random.randn(1000, 4), columns=['a', 'b', 'c', 'd'])
scatter_matrix(df, alpha=0.2, figsize=(6, 6), diagonal='kde')

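
Since scatter_matrix() returns the grid of matplotlib axes it draws on, the result can be adjusted afterwards with ordinary matplotlib calls. The following is only a minimal sketch of that idea: the random data, the column names, the tick rotation and the title are assumptions chosen for illustration, and diagonal='hist' is used so that SciPy is not needed.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Illustrative random data; the column names are arbitrary.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((200, 3)), columns=['a', 'b', 'c'])

# scatter_matrix returns a 2-D NumPy array of matplotlib Axes objects.
axes = scatter_matrix(df, alpha=0.5, figsize=(6, 6), diagonal='hist')

# Example tweak: rotate the x tick labels on the bottom row of subplots.
for ax in axes[-1, :]:
    ax.tick_params(axis='x', labelrotation=45)

# All subplots share one figure, which can be titled or saved as usual.
fig = axes[0, 0].get_figure()
fig.suptitle('scatter_matrix with per-axes tweaks')
```

Call plt.show() or fig.savefig(...) afterwards to display or save the result.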


Generally speaking, matplotlib doesn't usually contain plotting functions that operate on more than one axes object (subplot, in this case). The expectation is that you'd write a simple function to string things together however you'd like.

I'm not quite sure what your data looks like, but it's quite simple to build a function to do this from scratch. If you're always going to be working with structured or rec arrays, you can simplify this a touch: there's always a name associated with each data series, so you don't have to specify the names separately.

As an example:

import itertools
import numpy as np
import matplotlib.pyplot as plt

def main():
    np.random.seed(1977)
    numvars, numdata = 4, 10
    data = 10 * np.random.random((numvars, numdata))
    fig = scatterplot_matrix(data, ['mpg', 'disp', 'drat', 'wt'],
            linestyle='none', marker='o', color='black', mfc='none')
    fig.suptitle('Simple Scatterplot Matrix')
    plt.show()

def scatterplot_matrix(data, names, **kwargs):
    """Plots a scatterplot matrix of subplots.  Each row of "data" is plotted
    against other rows, resulting in a nrows by nrows grid of subplots with the
    diagonal subplots labeled with "names".  Additional keyword arguments are
    passed on to matplotlib's "plot" command. Returns the matplotlib figure
    object containing the subplot grid."""
    numvars, numdata = data.shape
    fig, axes = plt.subplots(nrows=numvars, ncols=numvars, figsize=(8,8))
    fig.subplots_adjust(hspace=0.05, wspace=0.05)

    for ax in axes.flat:
        # Hide all ticks and labels
        ax.xaxis.set_visible(False)
        ax.yaxis.set_visible(False)

        # Set up ticks only on one side for the "edge" subplots...
        # (Axes.is_first_col() and friends were removed in matplotlib 3.6;
        # the SubplotSpec methods below replace them.)
        spec = ax.get_subplotspec()
        if spec.is_first_col():
            ax.yaxis.set_ticks_position('left')
        if spec.is_last_col():
            ax.yaxis.set_ticks_position('right')
        if spec.is_first_row():
            ax.xaxis.set_ticks_position('top')
        if spec.is_last_row():
            ax.xaxis.set_ticks_position('bottom')

    # Plot the data.
    for i, j in zip(*np.triu_indices_from(axes, k=1)):
        for x, y in [(i,j), (j,i)]:
            # axes[x,y] sits in row x, column y: variable y goes on the
            # horizontal axis so the edge ticks are valid for the whole column.
            axes[x,y].plot(data[y], data[x], **kwargs)

    # Label the diagonal subplots...
    for i, label in enumerate(names):
        axes[i,i].annotate(label, (0.5, 0.5), xycoords='axes fraction',
                ha='center', va='center')

    # Turn on the proper x or y axes ticks.
    for i, j in zip(range(numvars), itertools.cycle((-1, 0))):
        axes[j,i].xaxis.set_visible(True)
        axes[i,j].yaxis.set_visible(True)

    return fig

main()

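
The simplification for structured arrays mentioned above could look roughly like the sketch below. The helper name rows_and_names is hypothetical (it is not part of NumPy or matplotlib), the field names and random values are made up, and the helper assumes every field is numeric.

```python
import numpy as np

def rows_and_names(structured):
    """Split a structured array into (data, names).

    "data" has one row per field, which is the layout that the
    scatterplot_matrix function above expects.
    """
    names = list(structured.dtype.names)
    # Stack each field as one row; assumes all fields are numeric.
    data = np.vstack([structured[name].astype(float) for name in names])
    return data, names

# Illustrative structured array filled with random values.
rng = np.random.default_rng(1977)
dtype = [('mpg', 'f8'), ('disp', 'f8'), ('drat', 'f8'), ('wt', 'f8')]
records = np.zeros(5, dtype=dtype)
for name in records.dtype.names:
    records[name] = 10 * rng.random(5)

data, names = rows_and_names(records)
```

With this in place, scatterplot_matrix(*rows_and_names(records), linestyle='none', marker='o') would pick up the labels from the array itself.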


You can also use Seaborn's pairplot function:

import seaborn as sns
sns.set()
df = sns.load_dataset("iris")
sns.pairplot(df, hue="species")