Get mean and mode of dataframe depending on each column type
You can use describe with include='all', which calculates statistics depending on each column's dtype. It reports top (the mode) for object columns and mean for numeric columns. Then combine the two rows so that each column gets whichever value applies.
    s = df1.describe(include='all')
    s = s.loc['top'].combine_first(s.loc['mean'])

    # Group       Winner
    # Study         Read
    # Score     0.883333
    # Name: top, dtype: object
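For anyone who wants to run the describe approach end to end, here is a minimal sketch. The contents of df1 are not shown in the post, so the frame below is hypothetical data chosen to give a result like the one above; the variable names stats and summary are also just for illustration.

```python
import pandas as pd

# Hypothetical data: the original df1 is not shown in the post.
df1 = pd.DataFrame({
    "Group": ["Winner", "Winner", "Loser"],
    "Study": ["Read", "Read", "Write"],
    "Score": [0.90, 0.80, 0.95],
})

# describe(include="all") produces rows such as "top" (most frequent value,
# filled only for non-numeric columns) and "mean" (filled only for numeric ones).
stats = df1.describe(include="all")

# Take "top" where it exists and fall back to "mean" where "top" is NaN.
summary = stats.loc["top"].combine_first(stats.loc["mean"])
print(summary)
```

Note that top reports only one value per column, so ties between equally frequent values are not visible with this approach.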
Using np.number and select_dtypes
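    s = df1.select_dtypes(np.number).mean()
    # Series.append was removed in pandas 2.0, so use pd.concat;
    # on older versions .mode().iloc[0].append(s) gives the same result
    pd.concat([df1.drop(s.index, axis=1).mode().iloc[0], s])

    Group       Winner
    Study         Read
    Score     0.883333
    dtype: object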
Variant
    g = df1.dtypes.map(lambda x: np.issubdtype(x, np.number))
    # Note: groupby(..., axis=1) is deprecated as of pandas 2.1
    d = {k: d for k, d in df1.groupby(g, axis=1)}
    pd.concat([d[False].mode().iloc[0], d[True].mean()])

    Group       Winner
    Study         Read
    Score     0.883333
    dtype: object
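The same split between numeric and non-numeric columns can be wrapped in a small helper that avoids both Series.append and a column-wise groupby. This is only a sketch: the function name mean_or_mode and the sample frame are hypothetical, and it assumes the frame has at least one row and one column.

```python
import numpy as np
import pandas as pd


def mean_or_mode(df):
    """Return the mean of numeric columns and the first mode of all others."""
    numeric = df.select_dtypes(include=np.number)
    other = df.select_dtypes(exclude=np.number)

    parts = []
    if other.shape[1] > 0:
        # DataFrame.mode() lists modes row by row; row 0 is the first mode.
        parts.append(other.mode().iloc[0])
    if numeric.shape[1] > 0:
        parts.append(numeric.mean())

    # Restore the original column order after concatenating the two pieces.
    return pd.concat(parts).reindex(df.columns)


# Hypothetical data: the original df1 is not shown in the post.
df1 = pd.DataFrame({
    "Group": ["Winner", "Winner", "Loser"],
    "Study": ["Read", "Read", "Write"],
    "Score": [0.90, 0.80, 0.95],
})

result = mean_or_mode(df1)
print(result)
```

The reindex at the end is a design choice, not something the snippets above do: they return the non-numeric columns first and the numeric ones last.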
Here is a slight variation on your solution that gets the job done:
    res = {}
    for col_name, col_type in zip(df1.columns, df1.dtypes):
        if pd.api.types.is_numeric_dtype(col_type):
            res[col_name] = df1[col_name].mean()
        else:
            res[col_name] = df1[col_name].mode()[0]
    pd.DataFrame(res, index=[0])
returns
        Group Study     Score
    0  Winner  Read  0.883333
Note that there can be multiple modes in a Series; this solution picks the first one.
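To see what "multiple modes" looks like in practice, here is a small sketch with a made-up Series in which two values tie for most frequent. Series.mode() returns all of them in sorted order, so taking the first element is a deterministic but arbitrary choice.

```python
import pandas as pd

# Made-up values: "Read" and "Write" both appear twice.
s = pd.Series(["Read", "Write", "Read", "Write", "Sleep"])

modes = s.mode()        # all most-frequent values, sorted
first = modes.iloc[0]   # what mode()[0] picks in the loop above

print(modes.tolist())   # ['Read', 'Write']
print(first)            # Read
```

If ties matter for your data, you could keep the whole modes Series (or its length) instead of only the first entry.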