How to get the number of the most frequent value in a column?
Building on @jonathanrocher's answer, you could use the mode method of a pandas DataFrame. It returns the most frequent value(s) along the rows or columns; if several values tie, all of them are returned:
import pandas as pd
import numpy as np

df = pd.DataFrame({"a": [1,2,2,4,2], "b": [np.nan, np.nan, np.nan, 3, 3]})

In [2]: df.mode()
Out[2]:
   a    b
0  2  3.0
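Since the question asks for the count of the most frequent value and not only the value itself, here is a minimal sketch using `Series.value_counts`, which drops NaN by default and sorts by count in descending order. The helper name `most_frequent` is hypothetical, chosen only for illustration, and it assumes the column has at least one non-NaN value.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 2, 4, 2], "b": [np.nan, np.nan, np.nan, 3, 3]})


def most_frequent(series):
    """Return (value, count) for the most frequent non-NaN value.

    Hypothetical helper; raises IndexError if the Series has no non-NaN values.
    """
    # value_counts() skips NaN by default and sorts counts in descending order,
    # so the first entry is the most frequent value.
    counts = series.value_counts()
    return counts.index[0], int(counts.iloc[0])


value_a, count_a = most_frequent(df["a"])
value_b, count_b = most_frequent(df["b"])
print(value_a, count_a)
print(value_b, count_b)
```

If several values tie for first place, this returns only one of them; `df.mode()` is the better choice when all tied values are needed.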
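You may also consider SciPy's mode function, which in the version used here ignores NaN (recent SciPy releases propagate NaN by default and need nan_policy="omit" to skip it). A solution using it could look like: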
from scipy.stats import mode
from numpy import nan
from pandas import DataFrame

df = DataFrame({"a": [1,2,2,4,2], "b": [nan, nan, nan, 3, 3]})
print(mode(df))
The output would look like
(array([[ 2., 3.]]), array([[ 3., 2.]]))
meaning that the most common values are 2 for the first column and 3 for the second, with frequencies 3 and 2 respectively.
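On recent SciPy releases, NaN is not skipped unless you ask for it, and the shape of the result differs between versions. The following is a sketch under those assumptions, not a definitive recipe: it passes `nan_policy="omit"` explicitly and flattens both arrays so that the same code reads the values regardless of the returned shape.

```python
import numpy as np
import pandas as pd
from scipy.stats import mode

df = pd.DataFrame({"a": [1, 2, 2, 4, 2], "b": [np.nan, np.nan, np.nan, 3, 3]})

# axis=0 computes one mode per column; nan_policy="omit" skips NaN when counting.
modes, counts = mode(df.to_numpy(), axis=0, nan_policy="omit")

# Flatten both arrays because the result shape depends on the SciPy version.
modes = np.asarray(modes).ravel()
counts = np.asarray(counts).ravel()

for column, value, count in zip(df.columns, modes, counts):
    print(column, value, int(count))
```

Converting with `df.to_numpy()` first makes it explicit that SciPy works on a plain float array here, so the column labels have to be re-attached by hand, as the loop does.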