Python - Drop duplicate based on max value of a column


Use DataFrameGroupBy.idxmax to get the index labels of the rows holding the maximum value3 in each group, then select those rows from the DataFrame with loc:

    print (df.groupby(['id1','id2','value1']).value3.idxmax())
    id1  id2  value1
    1    2    30        1
    3    5    12        4
    24   12   1         6
    Name: value3, dtype: int64

    df = df.loc[df.groupby(['id1','id2','value1']).value3.idxmax()]
    print (df)
       id1  id2  value1  value2  value3   a
    1    1    2      30      42    26.2 NaN
    4    3    5      12      33    11.2 NaN
    6   24   12       1      23     1.9 NaN
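For a version that can be run end to end, here is a minimal sketch of the idxmax approach. The input frame is hypothetical, since the original data is not shown here; only rows 1, 4 and 6 are taken from the output above, and column a is left out for brevity.

```python
import pandas as pd

# Hypothetical input: the original frame is not shown, so these rows are
# made up, except that rows 1, 4 and 6 match the output printed above.
df = pd.DataFrame({
    'id1':    [1, 1, 3, 3, 3, 24, 24],
    'id2':    [2, 2, 5, 5, 5, 12, 12],
    'value1': [30, 30, 12, 12, 12, 1, 1],
    'value2': [40, 42, 31, 32, 33, 22, 23],
    'value3': [10.0, 26.2, 5.5, 7.1, 11.2, 0.4, 1.9],
})

# Index label of the row holding the largest value3 in each group
idx = df.groupby(['id1', 'id2', 'value1'])['value3'].idxmax()

# Keep only those rows; the original index labels are preserved
result = df.loc[idx]
print(result)
```

Because loc selects by label, this approach assumes the index labels of df are unique; with a duplicated index, calling reset_index(drop=True) first would be one way around it.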

Another possible solution is to sort by column value3 in descending order with sort_values, then group by the key columns and take the first row of each group with GroupBy.first:

    df = (df.sort_values('value3', ascending=False)
            .groupby(['id1','id2','value1'], sort=False)
            .first()
            .reset_index())
    print (df)
       id1  id2  value1  value2  value3   a
    0    1    2      30      42    26.2 NaN
    1    3    5      12      33    11.2 NaN
    2   24   12       1      23     1.9 NaN
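The sort-then-first idea can also be sketched as a self-contained script, again on a hypothetical frame (the original data is not shown here). Since the question is about dropping duplicates, the sketch also includes DataFrame.drop_duplicates with keep='first' after the same sort; that variant is an alternative, not part of the answer above. One difference worth knowing: GroupBy.first takes the first non-null value of each column separately, while drop_duplicates keeps whole rows and their original index labels.

```python
import pandas as pd

# Hypothetical input, made up for illustration
df = pd.DataFrame({
    'id1':    [1, 1, 3, 3, 3, 24, 24],
    'id2':    [2, 2, 5, 5, 5, 12, 12],
    'value1': [30, 30, 12, 12, 12, 1, 1],
    'value2': [40, 42, 31, 32, 33, 22, 23],
    'value3': [10.0, 26.2, 5.5, 7.1, 11.2, 0.4, 1.9],
})

keys = ['id1', 'id2', 'value1']

# Sort so the largest value3 comes first, then take the first row per group.
# sort=False keeps groups in order of first appearance in the sorted frame.
by_first = (df.sort_values('value3', ascending=False)
              .groupby(keys, sort=False)
              .first()
              .reset_index())

# Alternative: keep the first (largest value3) row for each key combination
by_dedup = (df.sort_values('value3', ascending=False)
              .drop_duplicates(subset=keys, keep='first'))

print(by_first)
print(by_dedup)
```

On data without missing values, both variants pick the same rows; they differ only in the resulting index (a fresh 0..n-1 range versus the original labels).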