Querying a pandas dataframe column which has values as list


You cannot use pd.DataFrame.query to test whether a string is a member of each list in a series of lists. Storing lists in pandas dataframes is not recommended: the column falls back to object dtype and you lose vectorised functionality.
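To illustrate what "vectorised" buys you, here is a minimal sketch of one alternative layout. It assumes you are free to reshape the data, and the sample rows are hypothetical (they are not from the question). pd.DataFrame.explode gives one row per tag, so the tags column holds plain strings and an ordinary comparison works:

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({
    'score': [1, 2, 3],
    'tags': [
        ['apple', 'pear', 'guava'],
        ['truck', 'car', 'plane'],
        ['cat', 'dog', 'mouse'],
    ],
})

# One row per (score, tag) pair; the original index labels are repeated.
long_df = df.explode('tags')

# The tags column now holds strings, so == is a vectorised comparison.
matching_labels = long_df[long_df['tags'] == 'apple'].index.unique()

# Map the matching labels back to the original one-row-per-record frame.
res = df.loc[matching_labels]
print(res)
```

Whether the reshaping is worth it depends on how often you query by tag; for a one-off filter, the approaches that follow are simpler.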

With your existing dataframe, you can instead calculate a mask using pd.Series.apply:

res = df[df['tags'].apply(lambda x: 'apple' in x)]
print(res)

   score                  tags
0      1  [apple, pear, guava]
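For a version you can run end to end, the sketch below splits the one-liner into its two steps. The dataframe is an assumption: only the first row appears in the output above, and the other two rows are made up so that the filter has something to exclude.

```python
import pandas as pd

# Hypothetical sample data: only the first row is taken from the output above.
df = pd.DataFrame({
    'score': [1, 2, 3],
    'tags': [
        ['apple', 'pear', 'guava'],
        ['truck', 'car', 'plane'],
        ['cat', 'dog', 'mouse'],
    ],
})

# Step 1: one Python-level membership test per row, collected into a Boolean Series.
mask = df['tags'].apply(lambda x: 'apple' in x)

# Step 2: Boolean indexing keeps only the rows where the mask is True.
res = df[mask]

print(mask.tolist())
print(res)
```

Keeping the mask in its own variable makes it easy to inspect or to combine with other conditions using & and |.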

Or you can use a list comprehension:

res = df[['apple' in x for x in df['tags']]]

A third option is to convert each list to a set and test whether it is a superset of the tags you want:

res = df[df['tags'].apply(set) >= {'apple'}]

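The last option, although more expensive because it builds a set for every row, may suit cases where you are testing for several tags at once, since a row matches only if its set contains every tag on the right-hand side. In each case, we are constructing a Boolean series, which we then use to mask the dataframe.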
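As a rough sketch of the multiple-tag case, the example below uses set.issubset and set.isdisjoint inside apply. This is a variation on the set comparison above, not the same expression, and the sample rows and the names required and wanted are hypothetical.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration.
df = pd.DataFrame({
    'score': [1, 2, 3],
    'tags': [
        ['apple', 'pear', 'guava'],
        ['truck', 'car', 'plane'],
        ['cat', 'dog', 'mouse'],
    ],
})

required = {'apple', 'pear'}  # a row must contain every one of these
wanted = {'apple', 'car'}     # a row must contain at least one of these

# "All of": True where the row's list contains every required tag.
all_mask = df['tags'].apply(lambda tags: required.issubset(tags))

# "Any of": True where the row's list shares at least one tag with wanted.
any_mask = df['tags'].apply(lambda tags: not wanted.isdisjoint(tags))

res_all = df[all_mask]
res_any = df[any_mask]
print(res_all)
print(res_any)
```

Both set methods accept any iterable, so the lists do not need to be converted to sets first.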