Groupby names, replace values with their max value in all columns (pandas)
Try using pd.wide_to_long to melt the DataFrame into long form, then use groupby to find the max value for each name. Map that max value onto 'name' and reshape back to the four-column (wide) DataFrame:
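    # Long form: one row per (index, num) pair, with columns 'name' and 'val'
    df_long = pd.wide_to_long(df.reset_index(), ['name', 'val'], 'index',
                              j='num', sep='', suffix=r'\d+')
    # Max value per name, mapped back onto every row
    mapper = df_long.groupby('name')['val'].max()
    df_long['val'] = df_long['name'].map(mapper)
    # Back to wide form, flattening the (stub, num) column MultiIndex
    df_new = df_long.unstack()
    df_new.columns = [f'{i}{j}' for i, j in df_new.columns]
    df_new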
Output:
          name1 name2  val1  val2
    index
    0       AAA   BBB    31    22
    1       BBB   AAA    22    31
    2       BBB   CCC    22    15
    3       CCC   AAA    15    31
    4       DDD   EEE    25    35
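The question's input frame is not shown here, so the following is only a sketch: it runs the same wide_to_long steps end to end on a hypothetical `df` whose values are assumptions, picked so that the per-name maxima match the output above.

```python
import pandas as pd

# Hypothetical input (an assumption, not the asker's data): values chosen so
# the per-name maxima are AAA=31, BBB=22, CCC=15, DDD=25, EEE=35.
df = pd.DataFrame({
    "name1": ["AAA", "BBB", "BBB", "CCC", "DDD"],
    "val1": [31, 18, 12, 10, 25],
    "name2": ["BBB", "AAA", "CCC", "AAA", "EEE"],
    "val2": [22, 20, 15, 5, 35],
})

# Wide -> long: rows indexed by (index, num), columns 'name' and 'val'.
df_long = pd.wide_to_long(
    df.reset_index(), ["name", "val"], i="index", j="num", sep="", suffix=r"\d+"
)

# Max value per name across both original column pairs.
mapper = df_long.groupby("name")["val"].max()
df_long["val"] = df_long["name"].map(mapper)

# Long -> wide: unstack 'num' and flatten the column MultiIndex.
df_new = df_long.unstack()
df_new.columns = [f"{i}{j}" for i, j in df_new.columns]
print(df_new)
```

Keeping `mapper` as a separate Series makes it easy to inspect the per-name maxima before they are written back.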
Borrowing Scott's setup:
    df_long = pd.wide_to_long(df.reset_index(), ['name', 'val'], 'index',
                              j='num', sep='', suffix=r'\d+')
    d = df_long.groupby('name')['val'].max()
    # Replace each name with its max, then write the raw values into the val columns
    df.loc[:, df.columns.str.startswith('val')] = (
        df.loc[:, df.columns.str.startswith('name')].replace(d).values
    )
    df

      name1  val1 name2  val2
    0   AAA    31   BBB    22
    1   BBB    22   AAA    31
    2   BBB    22   CCC    15
    3   CCC    15   AAA    31
    4   DDD    25   EEE    35
You can use lreshape (undocumented, and it's unclear whether it's tested or will remain in pandas) to get the long DataFrame, then fill each val column by mapping its paired name column to the max.
    names = df.columns[df.columns.str.startswith('name')]
    vals = df.columns[df.columns.str.startswith('val')]
    s = (pd.lreshape(df, groups={'name': names, 'val': vals})
           .groupby('name')['val'].max())
    for n in names:
        df[n.replace('name', 'val')] = df[n].map(s)
      name1  val1 name2  val2
    0   AAA    31   BBB    22
    1   BBB    22   AAA    31
    2   BBB    22   CCC    15
    3   CCC    15   AAA    31
    4   DDD    25   EEE    35
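If relying on lreshape is a concern, the same long frame can be built with pd.concat instead. This is a swapped-in alternative rather than the approach in the answer above, and the input `df` is again hypothetical, with values assumed so that the result matches the output shown.

```python
import pandas as pd

# Hypothetical input (an assumption, not the asker's data).
df = pd.DataFrame({
    "name1": ["AAA", "BBB", "BBB", "CCC", "DDD"],
    "val1": [31, 18, 12, 10, 25],
    "name2": ["BBB", "AAA", "CCC", "AAA", "EEE"],
    "val2": [22, 20, 15, 5, 35],
})

names = df.columns[df.columns.str.startswith("name")]

# Stack every (nameN, valN) pair under the shared labels 'name' and 'val'.
# Assumes each nameN column has a matching valN column.
pairs = [(n, n.replace("name", "val")) for n in names]
long_df = pd.concat(
    [df[[n, v]].rename(columns={n: "name", v: "val"}) for n, v in pairs],
    ignore_index=True,
)

# Max per name, then overwrite each val column from its paired name column.
s = long_df.groupby("name")["val"].max()
for n, v in pairs:
    df[v] = df[n].map(s)

print(df)
```

The column order of `df` is left unchanged, so the result keeps the name1, val1, name2, val2 layout.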