Pandas normalise by column on groupby
Use GroupBy.transform:
columns = ['x', 'y']
g = df.groupby('id')[columns]
df[columns] = (df[columns] - g.transform('min')) / (g.transform('max') - g.transform('min'))
print(df)
    id    x    y
0  id1  0.0  0.0
1  id1  1.0  1.0
2  id2  0.0  0.0
3  id2  1.0  1.0
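For anyone who wants to run the transform approach end to end, here is a minimal sketch. The sample values in df are hypothetical (the original input data is not shown), and the intermediate names group_min and group_range are only there for readability. Note that a group whose values are all equal has a range of zero, so its normalised values would come out as NaN.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame({
    "id": ["id1", "id1", "id1", "id2", "id2"],
    "x": [1.0, 3.0, 5.0, 10.0, 20.0],
    "y": [2.0, 4.0, 3.0, 7.0, 5.0],
})
columns = ["x", "y"]

# transform returns a frame aligned row by row with df,
# so the per-group min and range can be used in plain arithmetic.
g = df.groupby("id")[columns]
group_min = g.transform("min")
group_range = g.transform("max") - group_min

# Min-max scale each column within its own id group.
df[columns] = (df[columns] - group_min) / group_range
print(df)
```

Because transform keeps the original index, the result can be assigned straight back to df[columns] without any merge or reindexing.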
It is unclear how to write each normalised column back into df after calling
df.groupby(['id']).apply(lambda x: ...)
You can call apply again, this time on each group's columns:
df.groupby(["id"])\
    .apply(lambda id_df: id_df[columns]\
        .apply(lambda serie: (serie - serie.min()) / (serie.max() - serie.min())))
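To write the result of the nested apply back into the frame, one option is to pass group_keys=False so the result keeps the original row index and can be assigned directly. This is a sketch under assumptions: the sample data is made up, and min_max is a hypothetical helper name, not something from the answer above.

```python
import pandas as pd

# Hypothetical sample data; the values are illustrative only.
df = pd.DataFrame({
    "id": ["id1", "id1", "id2", "id2"],
    "x": [1.0, 5.0, 10.0, 20.0],
    "y": [2.0, 4.0, 7.0, 5.0],
})
columns = ["x", "y"]


def min_max(serie):
    """Scale one column of one group to the [0, 1] range."""
    return (serie - serie.min()) / (serie.max() - serie.min())


# group_keys=False keeps the original index instead of prepending the id,
# so the scaled frame lines up with df when assigned back.
scaled = df.groupby("id", group_keys=False)[columns].apply(
    lambda id_df: id_df.apply(min_max)
)
df[columns] = scaled
print(df)
```

Selecting [columns] before apply means the lambda only ever sees the columns to be scaled, so the id column is left untouched.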
Probably not the best way, but if your dataframe is not huge, then this will do:
list_of_IDs = df['id'].unique()
for column in columns:
    for group_id in list_of_IDs:
        # Boolean mask selecting the rows that belong to this id
        mask = df['id'] == group_id
        values = df.loc[mask, column]
        df.loc[mask, column] = (values - values.min()) / (values.max() - values.min())