Get the row(s) which have the max value in groups using groupby

python pandas max pandas-groupby

In [1]: dfOut[1]:    Sp  Mt Value  count0  MM1  S1     a      31  MM1  S1     n      22  MM1  S3    cb      53  MM2  S3    mk      84  MM2  S4    bg     105  MM2  S4   dgd      16  MM4  S2    rd      27  MM4  S2    cb      28  MM4  S2   uyi      7In [2]: df.groupby(['Mt'], sort=False)['count'].max()Out[2]:MtS1     3S3     8S4    10S2     7Name: count

To get the indices of the original DF you can do:

In [3]: idx = df.groupby(['Mt'])['count'].transform(max) == df['count']In [4]: df[idx]Out[4]:    Sp  Mt Value  count0  MM1  S1     a      33  MM2  S3    mk      84  MM2  S4    bg     108  MM4  S2   uyi      7

Note that if you have multiple max values per group, all will be returned.

Update

On a hail mary chance that this is what the OP is requesting:

In [5]: df['count_max'] = df.groupby(['Mt'])['count'].transform(max)In [6]: dfOut[6]:    Sp  Mt Value  count  count_max0  MM1  S1     a      3          31  MM1  S1     n      2          32  MM1  S3    cb      5          83  MM2  S3    mk      8          84  MM2  S4    bg     10         105  MM2  S4   dgd      1         106  MM4  S2    rd      2          77  MM4  S2    cb      2          78  MM4  S2   uyi      7          7

python pandas max pandas-groupby

You can sort the dataFrame by count and then remove duplicates. I think it's easier:

df.sort_values('count', ascending=False).drop_duplicates(['Sp','Mt'])

python pandas max pandas-groupby

Easy solution would be to apply : idxmax() function to get indices of rows with max values. This would filter out all the rows with max value in the group.

In [365]: import pandas as pdIn [366]: df = pd.DataFrame({'sp' : ['MM1', 'MM1', 'MM1', 'MM2', 'MM2', 'MM2', 'MM4', 'MM4','MM4'],'mt' : ['S1', 'S1', 'S3', 'S3', 'S4', 'S4', 'S2', 'S2', 'S2'],'val' : ['a', 'n', 'cb', 'mk', 'bg', 'dgb', 'rd', 'cb', 'uyi'],'count' : [3,2,5,8,10,1,2,2,7]})In [367]: df                                                                                                       Out[367]:    count  mt   sp  val0      3  S1  MM1    a1      2  S1  MM1    n2      5  S3  MM1   cb3      8  S3  MM2   mk4     10  S4  MM2   bg5      1  S4  MM2  dgb6      2  S2  MM4   rd7      2  S2  MM4   cb8      7  S2  MM4  uyi### Apply idxmax() and use .loc() on dataframe to filter the rows with max values:In [368]: df.loc[df.groupby(["sp", "mt"])["count"].idxmax()]                                                       Out[368]:    count  mt   sp  val0      3  S1  MM1    a2      5  S3  MM1   cb3      8  S3  MM2   mk4     10  S4  MM2   bg8      7  S2  MM4  uyi### Just to show what values are returned by .idxmax() above:In [369]: df.groupby(["sp", "mt"])["count"].idxmax().values                                                        Out[369]: array([0, 2, 3, 4, 8])

CodeHunter

Get the row(s) which have the max value in groups using groupby

Recent Posts

How can I color dots in a xy scatterplot according to column value?

How to update a claim in ASP.NET Identity?

What does {0} mean when initializing an object?

Accessing members of items in a JSONArray with Java

How to log SQL statements in Spring Boot?

Powershell Get-WebSite name parameter is ignored

How to detect scroll to bottom of html element

Java synchronized method

How to test controllers with CodeIgniter?

Detect Visual Composer

Matplotlib: Specify format of floats for tick labels

Rails join a list of strings with commas and "and" before the last