Python pandas dataframe: find the max for each unique value of another column
Sample data (note that you posted an image which can't be used by potential answerers without retyping, so I'm making a simple example in its place):
    import pandas as pd

    df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2], 'a': range(8), 'b': range(8, 0, -1)})
The key to this is just using idxmax and idxmin and then futzing with the indexes so that you can merge things in a readable way. Here's the whole answer; you may wish to examine the intermediate dataframes to see how it works.
    # Row labels of the per-group max and min, tagged by type
    df_max = df.groupby('id').idxmax()
    df_max['type'] = 'max'
    df_min = df.groupby('id').idxmin()
    df_min['type'] = 'min'

    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    df2 = (pd.concat([df_max, df_min])
             .set_index('type', append=True)
             .stack()
             .rename('index'))

    # Line the (type, column) tags up with the matching rows of df
    df3 = pd.concat([df2.reset_index().drop('id', axis=1).set_index('index'),
                     df.loc[df2.values]],
                    axis=1)

    df3.set_index(['id', 'level_2', 'type']).sort_index()

                     a  b
    id level_2 type
    1  a       max   3  5
               min   0  8
       b       max   0  8
               min   3  5
    2  a       max   7  1
               min   4  4
       b       max   4  4
               min   7  1
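If you want a first intermediate step to look at, here is a minimal sketch that rebuilds the same sample df so it runs on its own. It only shows what idxmax and idxmin hand back per group: row labels of df, not the max/min values themselves.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

# For each id, the row label in df where each column is largest / smallest
df_max = df.groupby('id').idxmax()
df_min = df.groupby('id').idxmin()

print(df_max)
print(df_min)
```

For id 1, for example, column a peaks at row label 3 while column b peaks at row label 0, which is why the two columns point at different rows.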
Note in particular that df2 looks like this:
    id  type
    1   max   a    3
              b    0
    2   max   a    7
              b    4
    1   min   a    0
              b    3
    2   min   a    4
              b    7
The last column there holds the index values in df that were derived with idxmax & idxmin. So basically all the information you need is in df2. The rest of it is just a matter of merging back with df and making it more readable.
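To make the "merging back" part concrete, here is a small sketch on the same sample data, restricted to the max of column a to keep it short. The names labels and rows are just illustrative, not part of the answer's code.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 1, 1, 1, 2, 2, 2, 2],
                   'a': range(8),
                   'b': range(8, 0, -1)})

# Row labels where 'a' is largest within each id
labels = df.groupby('id')['a'].idxmax()

# .loc with those labels pulls the complete rows back out of df
rows = df.loc[labels]
print(rows)
```

The full answer does the same lookup with df.loc[df2.values], just for every combination of id, column and max/min at once.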