Filtering pandas dataframe with multiple Boolean columns


In [82]: d
Out[82]:
             A   B      C      D
0     John Doe  45   True  False
1   Jane Smith  32  False  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 1:

In [83]: d.loc[d.C | d.D]
Out[83]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 2:

In [94]: d[d[['C','D']].any(axis=1)]
Out[94]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 3:

In [95]: d.query("C or D")
Out[95]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
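For anyone who wants to run the three solutions side by side, here is a minimal, self-contained sketch. The frame is rebuilt from the output shown above, and the variable names are only illustrative. The axis is passed to any() by keyword because recent pandas versions no longer accept it positionally.

```python
import pandas as pd

# Rebuild the example frame from the output shown above.
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

by_operator = d.loc[d.C | d.D]           # Solution 1: element-wise OR
by_any = d[d[["C", "D"]].any(axis=1)]    # Solution 2: row-wise any()
by_query = d.query("C or D")             # Solution 3: query string

# All three keep the rows where C or D is True.
print(by_operator.equals(by_any) and by_any.equals(by_query))
print(by_operator.index.tolist())
```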

PS: If you change your solution to:

df[(df['C']==True) | (df['D']==True)]

it will work too.
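The parentheses in that expression matter, because | binds more tightly than == in Python. A small sketch of the difference follows; the frame is rebuilt from the example above, and the unparenthesized line is a hypothetical mistake used for illustration, not something taken from the question.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# With parentheses, each comparison is evaluated before the OR.
explicit = df[(df["C"] == True) | (df["D"] == True)]

# For columns that are already boolean, "== True" adds nothing.
plain = df[df["C"] | df["D"]]
print(explicit.equals(plain))

# Without parentheses, Python parses the expression as the chained comparison
# df["C"] == (True | df["D"]) == True, which needs the truth value of a Series.
try:
    df[df["C"] == True | df["D"] == True]
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)
```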

Pandas docs - boolean indexing


Why should we NOT use the "PEP-compliant" df["col_name"] is True instead of df["col_name"] == True?

In [11]: df = pd.DataFrame({"col":[True, True, True]})

In [12]: df
Out[12]:
    col
0  True
1  True
2  True

In [13]: df["col"] is True
Out[13]: False               # <----- oops, that's not exactly what we wanted
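The reason is that is tests object identity: it asks whether the Series object itself is the singleton True, so the answer is a single Python bool. The == operator is overloaded by pandas and compares element by element. A short sketch, with one False value added to the column to make the difference visible:

```python
import pandas as pd

df = pd.DataFrame({"col": [True, False, True]})

identity = df["col"] is True       # identity test on the Series object itself
elementwise = df["col"] == True    # element-wise comparison, returns a Series

print(type(identity), identity)
print(elementwise.tolist())

# For a boolean column the comparison is redundant: the column is already a mask.
selected = df[df["col"]]
print(selected.index.tolist())
```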


Hooray! More options!

np.where

df[np.where(df.C | df.D, True, False)]
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

pd.Index.where on df.index

df.loc[df.index.where(df.C | df.D).dropna()]
               A   B      C      D
0.0     John Doe  45   True  False
2.0  Alan Holmes  55  False   True
3.0   Eric Lamar  29   True   True
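The 0.0, 2.0, 3.0 labels in that output are a side effect of where: labels that fail the condition are replaced by NaN, and an integer index cannot hold NaN without being upcast. A sketch of that behaviour follows, together with plain boolean indexing on the index as an alternative that keeps the integer labels. The alternative is my suggestion, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})
mask = (df.C | df.D).to_numpy()

# Index.where keeps labels where the mask is True and puts NaN elsewhere.
kept = df.index.where(mask)
print(kept)

# Boolean indexing on the index drops the non-matching labels instead,
# so the remaining labels stay integers.
labels = df.index[mask]
result = df.loc[labels]
print(result)
```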

df.select_dtypes

df[df.select_dtypes([bool]).any(axis=1)]
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
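The select_dtypes variant is handy when the flag columns are not known in advance. It could be wrapped in a small helper, sketched below; rows_with_any_flag is a hypothetical name, not a pandas function.

```python
import pandas as pd


def rows_with_any_flag(frame: pd.DataFrame) -> pd.DataFrame:
    """Keep rows where at least one bool-dtype column is True (hypothetical helper)."""
    flags = frame.select_dtypes(include=[bool])
    return frame[flags.any(axis=1)]


df = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

filtered = rows_with_any_flag(df)
print(filtered)
```

Note that only columns whose dtype is bool are picked up; an object column that happens to contain True/False values would be ignored by this sketch.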

Abusing np.select

df.iloc[np.select([df.C | df.D], [df.index])].drop_duplicates()
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
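This one really is abuse: np.select fills non-matching rows with its default of 0, so position 0 is always selected, and the result above is only correct because row 0 happens to match. The sketch below uses a hypothetical reordering of the data where the first row matches neither flag, and compares it with np.flatnonzero, which is my substitute here and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical variation of the example: the first row matches neither flag.
df = pd.DataFrame({
    "A": ["Jane Smith", "John Doe", "Alan Holmes"],
    "B": [32, 45, 55],
    "C": [False, True, False],
    "D": [False, False, True],
})
mask = (df.C | df.D).to_numpy()

# Non-matching rows fall back to default=0, so position 0 is picked as well.
positions = np.select([mask], [np.arange(len(df))])
via_select = df.iloc[positions].drop_duplicates()
print(via_select.index.tolist())

# np.flatnonzero returns only the positions where the mask is True.
via_flatnonzero = df.iloc[np.flatnonzero(mask)]
print(via_flatnonzero.index.tolist())
```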


Or

d[d.eval('C or D')]
Out[1065]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True