
Plotting histograms from grouped data in a pandas DataFrame


I'm on a roll, just found an even simpler way to do it using the by keyword in the hist method:

df['N'].hist(by=df['Letter'])

That's a very handy little shortcut for quickly scanning your grouped data!

For future visitors: this call produces a grid of subplots, one histogram of N for each value of Letter.
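For anyone who wants to try the shortcut end to end, here is a minimal sketch. The sample data (300 A's, 400 B's, 300 C's with random N), the `bins=20` choice, and the `Agg` backend are my own assumptions for illustration, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption: no display available)
import numpy as np
import pandas as pd

# Sample data assumed for illustration
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Letter": ["A"] * 300 + ["B"] * 400 + ["C"] * 300,
    "N": rng.standard_normal(1000),
})

# One subplot per distinct Letter; bins is forwarded to the histogram call
axes = df["N"].hist(by=df["Letter"], bins=20)

# Each subplot is titled with its group key; unused grid cells have an empty title
titles = [ax.get_title() for ax in np.ravel(axes)]
print(titles)
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `matplotlib.pyplot.show()` to see the figure.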


Your function is failing because the groupby DataFrame you end up with has a hierarchical index and two columns (Letter and N). When you call .hist(), it tries to make a histogram of both columns, hence the str error.

This is the default behavior of pandas plotting functions (one plot per column), so if you reshape your data frame so that each letter is a column, you will get exactly what you want.

df.reset_index().pivot(index='index', columns='Letter', values='N').hist()

The reset_index() is just there to move the current index into a column called index. Then pivot takes your data frame, collects all of the N values for each Letter, and makes each letter a column. The resulting data frame has one row per original index value (missing values are filled with NaN) and three columns (A, B, C). hist() will then produce one histogram per column, and you can format the plots as needed.
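To see what the reshaped frame looks like before plotting, here is a small sketch. The sample data and the name `wide` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data assumed for illustration: 300 A's, 400 B's, 300 C's
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "Letter": ["A"] * 300 + ["B"] * 400 + ["C"] * 300,
    "N": rng.standard_normal(1000),
})

# Move the row labels into a column named "index",
# then spread N across one column per Letter
wide = df.reset_index().pivot(index="index", columns="Letter", values="N")

print(list(wide.columns))   # column labels, one per letter
print(wide.notna().sum())   # number of non-missing values in each column
```

Calling `wide.hist()` on this frame draws one histogram per column; the NaN entries are dropped before binning, so each plot only reflects that letter's own values.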


One solution is to use matplotlib's histogram directly on each grouped data frame. Iterating over the groupby object yields each group's key together with its DataFrame, so you can create a histogram for each one in a loop.

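import numpy as np
import matplotlib.pyplot as plt
from pandas import DataFrame

x = ['A']*300 + ['B']*400 + ['C']*300
y = np.random.randn(1000)
df = DataFrame({'Letter': x, 'N': y})
grouped = df.groupby('Letter')

# Each iteration yields a (key, sub-DataFrame) pair
for letter, group in grouped:
    plt.figure()          # new figure for this letter
    plt.hist(group['N'])  # histogram of N within the group
    plt.show()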