How to get the most frequent row in table


Use groupby:

    # count each distinct row, keep the largest group, then drop the count column (labelled 0)
    df.groupby(df.columns.tolist()).size().sort_values().tail(1).reset_index().drop(columns=0)

       col_1  col_2 col_3
    0      1      1     A
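As a minimal sketch of the groupby idea, here is a self-contained version. The sample DataFrame is hypothetical (the question's table is not shown here), and `counts`, `most_common` and `result` are illustrative names. It uses `idxmax` on the counts instead of sorting, which is one possible variation rather than the answer's exact code.

```python
import pandas as pd

# Hypothetical sample data, chosen so that (1, 1, "A") is the most frequent row.
df = pd.DataFrame({
    "col_1": [1, 1, 0, 1],
    "col_2": [1, 0, 1, 1],
    "col_3": ["A", "B", "A", "A"],
})

# Group on every column: each group is one distinct row, size() is its count.
counts = df.groupby(df.columns.tolist()).size()

# idxmax returns the group key (a tuple of column values) with the largest count.
most_common = counts.idxmax()

# Rebuild a one-row DataFrame with the original column names.
result = pd.DataFrame([most_common], columns=df.columns)
print(result)
```

Note that groupby skips rows containing NaN in the key columns by default; in pandas 1.1 and later, passing `dropna=False` to `groupby` keeps them.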


With NumPy's np.unique -

    In [92]: u,idx,c = np.unique(df.values.astype(str), axis=0, return_index=True, return_counts=True)

    In [99]: df.iloc[[idx[c.argmax()]]]
    Out[99]:
       col_1  col_2 col_3
    0      1      1     A

If you are looking for performance, convert the string column to numeric and then use np.unique -

    a = np.c_[df.col_1, df.col_2, pd.factorize(df.col_3)[0]]
    u,idx,c = np.unique(a, axis=0, return_index=True, return_counts=True)
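The numeric variant stops before selecting the row, so here is a sketch that carries it through to the end. The sample DataFrame is hypothetical, and `codes` and `result` are illustrative names. The final selection reuses the `df.iloc[[idx[c.argmax()]]]` step from the string-based version.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen so that (1, 1, "A") is the most frequent row.
df = pd.DataFrame({
    "col_1": [1, 1, 0, 1],
    "col_2": [1, 0, 1, 1],
    "col_3": ["A", "B", "A", "A"],
})

# factorize maps each distinct string to an integer code, so the array is purely numeric.
codes = pd.factorize(df.col_3)[0]
a = np.c_[df.col_1, df.col_2, codes]

# u: unique rows, idx: position of each unique row's first occurrence, c: its count.
u, idx, c = np.unique(a, axis=0, return_index=True, return_counts=True)

# Look up the first occurrence of the most frequent row in the original DataFrame.
result = df.iloc[[idx[c.argmax()]]]
print(result)
```

Because the row is looked up in `df` by position, the string column comes back with its original values and the integer codes never need to be mapped back.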


You can do this with groupby and size:

    df = df.groupby(df.columns.tolist(),as_index=False).size()
    result = df.iloc[[df["size"].idxmax()]].drop(["size"], axis=1)
    result.reset_index(drop=True) #this is just to reset the index
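One thing to keep in mind is that `idxmax` returns only the first maximum, so if several rows tie for the highest count only one of them is reported. A possible sketch that keeps every tied row is below. The sample data and the names `counts`, `top` and `result` are hypothetical, and the sketch assumes a pandas version where `size()` with `as_index=False` returns a "size" column, as the snippet above already does.

```python
import pandas as pd

# Hypothetical sample data in which two distinct rows each occur twice.
df = pd.DataFrame({
    "col_1": [1, 1, 0, 0],
    "col_2": [1, 1, 1, 1],
    "col_3": ["A", "A", "B", "B"],
})

# One row per distinct combination, with its count in the "size" column.
counts = df.groupby(df.columns.tolist(), as_index=False).size()

# Keep every combination whose count equals the maximum, not just the first one.
top = counts[counts["size"] == counts["size"].max()]
result = top.drop(columns="size").reset_index(drop=True)
print(result)
```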