
Select dataframe row from rowname using case-insensitive (like `grep -i`)


You can do it like this:

    query = 'hdgfl1'
    mask = df.index.to_series().str.contains(query, case=False)
    df[mask]

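Another possibility would be:

    mask = df.reset_index()['index'].str.contains(query, case=False)

but this is about 2x slower. It also assumes the index is unnamed, so that reset_index() creates a column called 'index', and the resulting mask has a fresh integer index, so pass mask.values when indexing df.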
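
For reference, here is a minimal self-contained sketch of the mask approach. The sample data is borrowed from the select() answer further down; calling `.str` directly on the index and passing `regex=False` (to treat the query as a literal substring) are my own choices, not part of the answer above.

```python
import pandas as pd

# Sample frame whose row labels differ only in letter case for two rows.
df = pd.DataFrame.from_dict(
    {
        "1421293_at Hdgfl1": [2.140412, 1.143337, 3.260313],
        "1429877_at Lrriq3": [9.019368, 0.874524, 2.051820],
        "1421293_at hDGFl1": [2.140412, 1.143337, 3.260313],
    },
    orient="index",
)

query = "hdgfl1"

# Boolean mask over the row labels; case=False ignores letter case,
# regex=False treats the query as plain text rather than a pattern.
mask = df.index.str.contains(query, case=False, regex=False)

result = df[mask]
print(result)
```

Both the `Hdgfl1` and `hDGFl1` rows should be selected, while `Lrriq3` is dropped.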


    In [229]: df.filter(regex=r'(?i)hdgfl1', axis=0)
    Out[229]:
                              0         1         2
    1421293_at Hdgfl1  2.140412  1.143337  3.260313
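
A runnable version of the filter() call might look like the sketch below. The inline `(?i)` flag makes the regular expression case-insensitive; wrapping the query in `re.escape` is an extra precaution I have added in case the search text contains regex metacharacters, and the sample data is assumed.

```python
import re

import pandas as pd

# Assumed sample data with two labels that differ only in case.
df = pd.DataFrame.from_dict(
    {
        "1421293_at Hdgfl1": [2.140412, 1.143337, 3.260313],
        "1429877_at Lrriq3": [9.019368, 0.874524, 2.051820],
        "1421293_at hDGFl1": [2.140412, 1.143337, 3.260313],
    },
    orient="index",
)

query = "hdgfl1"

# "(?i)" switches on case-insensitive matching for the whole pattern.
pattern = "(?i)" + re.escape(query)

# axis=0 applies the regex to the row labels instead of the column names.
result = df.filter(regex=pattern, axis=0)
print(result)
```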


And with select():

    import re

    import pandas as pd

    mydict = {
        "1421293_at Hdgfl1": [2.140412, 1.143337, 3.260313],
        "1429877_at Lrriq3": [9.019368, 0.874524, 2.051820],
        "1421293_at hDGFl1": [2.140412, 1.143337, 3.260313],
    }
    df = pd.DataFrame.from_dict(mydict, orient='index')

    def create_match_func(a_str):
        # Return a predicate that is True for labels containing " <a_str>",
        # ignoring case. re.X is not used because it would make the regex
        # ignore the literal space in the pattern.
        def match_func(x):
            pattern = r".* {}".format(a_str)
            match_obj = re.search(pattern, x, flags=re.I)
            return match_obj is not None
        return match_func

    print(df)
    print('-' * 20)

    target = "hdgfl1"
    # DataFrame.select() is only available in pandas versions before 1.0.
    print(df.select(create_match_func(target), axis=0))

    --output:--
                              0         1         2
    1421293_at Hdgfl1  2.140412  1.143337  3.260313
    1429877_at Lrriq3  9.019368  0.874524  2.051820
    1421293_at hDGFl1  2.140412  1.143337  3.260313
    --------------------
                              0         1         2
    1421293_at Hdgfl1  2.140412  1.143337  3.260313
    1421293_at hDGFl1  2.140412  1.143337  3.260313

...

    df.select(lambda x: x == 'A', axis=1)

select() takes a function which operates on the label(s) along axis, and the function should return a boolean.

http://pandas.pydata.org/pandas-docs/stable/indexing.html#the-select-method
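
Note that select() was deprecated in pandas 0.21 and removed in 1.0, so the call above fails on current versions. One possible replacement, sketched here under the assumption that a plain boolean mask built from the labels is acceptable, is to evaluate the predicate on each label and pass the result to .loc. The helper name `make_label_matcher` and the sample data are illustrative only.

```python
import re

import numpy as np
import pandas as pd


def make_label_matcher(target):
    """Return a predicate that is True for labels containing ' <target>', ignoring case."""
    pattern = re.compile(r".* " + re.escape(target), flags=re.IGNORECASE)

    def matches(label):
        return pattern.search(label) is not None

    return matches


# Assumed sample data, the same shape as in the select() example.
df = pd.DataFrame.from_dict(
    {
        "1421293_at Hdgfl1": [2.140412, 1.143337, 3.260313],
        "1429877_at Lrriq3": [9.019368, 0.874524, 2.051820],
        "1421293_at hDGFl1": [2.140412, 1.143337, 3.260313],
    },
    orient="index",
)

matches = make_label_matcher("hdgfl1")

# Evaluate the predicate once per row label to build a boolean mask,
# then use it with .loc in place of the removed df.select(..., axis=0).
mask = np.array([matches(label) for label in df.index], dtype=bool)
result = df.loc[mask]
print(result)
```

For column labels, the same idea applies with `df.columns` and `df.loc[:, mask]`.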