Pandas groupby and filter
I think `groupby` is not necessary; use boolean indexing if you only need the rows where `V` is 0:
    print(df[df.V == 0])
        C  ID  V  YEAR
    0   0   1  0  2011
    3  33   2  0  2013
    5  55   3  0  2014
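The frame `df` itself is never defined in this answer. A minimal sketch, assuming sample data reconstructed from the printed outputs (the exact values are an assumption), could look like this:

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs printed in this answer.
df = pd.DataFrame({
    'C': [0, 11, 22, 33, 44, 55],
    'ID': [1, 1, 2, 2, 3, 3],
    'V': [0, 1, 1, 0, 1, 0],
    'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
})

# Boolean mask: True for every row where V equals 0.
mask = df['V'] == 0

# Indexing with the mask keeps only the rows marked True.
zero_rows = df[mask]
print(zero_rows)
```

No grouping is involved here: the mask is evaluated row by row, so only the matching rows are kept.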
But if you need to return all groups that contain at least one value of column `V` equal to 0, add `any`, because `filter` needs a single `True` or `False` per group to decide whether to keep all rows of that group:
    print(df.groupby(['ID']).filter(lambda x: (x['V'] == 0).any()))
        C  ID  V  YEAR
    0   0   1  0  2011
    1  11   1  1  2012
    2  22   2  1  2012
    3  33   2  0  2013
    4  44   3  1  2013
    5  55   3  0  2014
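To see which groups pass the test before filtering, the per-group flag can be computed on its own. This is only a sketch on assumed sample data reconstructed from the outputs shown; `has_zero_by_id` and `has_zero_by_year` are illustrative names:

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs printed in this answer.
df = pd.DataFrame({
    'C': [0, 11, 22, 33, 44, 55],
    'ID': [1, 1, 2, 2, 3, 3],
    'V': [0, 1, 1, 0, 1, 0],
    'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
})

# One boolean per group: does the group contain at least one V == 0?
has_zero_by_id = (df['V'] == 0).groupby(df['ID']).any()
has_zero_by_year = (df['V'] == 0).groupby(df['YEAR']).any()

print(has_zero_by_id)    # every ID group contains a zero
print(has_zero_by_year)  # the 2012 group does not
```

Every `ID` group contains a zero, which is why filtering by `ID` returns the whole frame.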
For testing, it is better to change the column used for `groupby`. The rows with `YEAR` 2012 are filtered out because that group has no `V == 0`:
    print(df.groupby(['YEAR']).filter(lambda x: (x['V'] == 0).any()))
        C  ID  V  YEAR
    0   0   1  0  2011
    3  33   2  0  2013
    4  44   3  1  2013
    5  55   3  0  2014
If performance is important, use `GroupBy.transform` with boolean indexing:
    print(df[(df['V'] == 0).groupby(df['YEAR']).transform('any')])
       ID  YEAR  V   C
    0   1  2011  0   0
    3   2  2013  0  33
    4   3  2013  1  44
    5   3  2014  0  55
Detail:
    print((df['V'] == 0).groupby(df['YEAR']).transform('any'))
    0     True
    1    False
    2    False
    3     True
    4     True
    5     True
    Name: V, dtype: bool
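As a quick sanity check, both approaches should select the same rows. A sketch, again on sample data assumed from the printed outputs:

```python
import pandas as pd

# Assumed sample data, reconstructed from the outputs printed in this answer.
df = pd.DataFrame({
    'C': [0, 11, 22, 33, 44, 55],
    'ID': [1, 1, 2, 2, 3, 3],
    'V': [0, 1, 1, 0, 1, 0],
    'YEAR': [2011, 2012, 2012, 2013, 2013, 2014],
})

# Approach 1: filter calls the lambda once per group and keeps whole groups.
filtered = df.groupby('YEAR').filter(lambda x: (x['V'] == 0).any())

# Approach 2: transform broadcasts the per-group result back to every row,
# giving a row-aligned boolean mask for ordinary boolean indexing.
mask = (df['V'] == 0).groupby(df['YEAR']).transform('any')
masked = df[mask]

# Compare the selected row labels from both approaches.
print(filtered.index.tolist() == masked.index.tolist())
```

The `transform` version avoids calling a Python lambda for each group, which is the likely reason it is faster on frames with many groups.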