Find and replace substrings in a Pandas dataframe ignore case


Do this the same way you would with a standard regex: use the inline (?i) flag.

df = df.replace('(?i)Number', 'NewWord', regex=True)

Granted, df.replace is limiting in that flags must be passed as part of the regex string rather than through a separate flags argument. If you were using str.replace, you could have passed case=False or flags=re.IGNORECASE instead.
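A minimal sketch comparing the two approaches side by side. The frame, the column name col, and the variable names are made up for illustration; regex=True is passed explicitly to str.replace because its default has differed between pandas versions.

```python
import re

import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({"col": ["this is a Number", "and another NuMBer", "number"]})

# DataFrame.replace: the ignore-case flag has to live inside the pattern.
via_df_replace = df.replace("(?i)number", "NewWord", regex=True)

# Series.str.replace: the flag can be passed as a separate argument.
via_str_replace = df["col"].str.replace(
    "number", "NewWord", flags=re.IGNORECASE, regex=True
)

print(via_df_replace)
print(via_str_replace)
```

Both calls replace every casing of "number", so the two results should hold the same strings.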


Simply use case=False in str.replace.

Example:

df = pd.DataFrame({'col':['this is a Number', 'and another NuMBer', 'number']})
>>> df
                  col
0    this is a Number
1  and another NuMBer
2              number
df['col'] = df['col'].str.replace('Number', 'NewWord', case=False)
>>> df
                   col
0    this is a NewWord
1  and another NewWord
2              NewWord

[Edit]: If you need to search several columns for the substring, you can select all columns with object dtype and apply the solution above to each of them. Example:

>>> df
                  col                col2  col3
0    this is a Number  numbernumbernumber     1
1  and another NuMBer                   x     2
2              number                   y     3
str_columns = df.select_dtypes('object').columns
df[str_columns] = (df[str_columns]
                   .apply(lambda x: x.str.replace('Number', 'NewWord', case=False)))
>>> df
                   col                   col2  col3
0    this is a NewWord  NewWordNewWordNewWord     1
1  and another NewWord                      x     2
2              NewWord                      y     3


Brutish. This only works if the whole string is exactly 'Number' or 'NUMBER'. It will not replace occurrences within a larger string, and of course it is limited to just those two spellings.

df.replace(['Number', 'NUMBER'], 'NewWord')
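To make the limitation concrete, here is a small sketch on made-up data (the frame and the name exact_only are illustrative). Without regex=True, df.replace compares whole cell values, so only cells that equal one of the listed strings change.

```python
import pandas as pd

# Hypothetical sample data: two exact matches, one substring, one other casing.
df = pd.DataFrame({"col": ["Number", "NUMBER", "this is a Number", "number"]})

# Non-regex replace matches entire cell values only.
exact_only = df.replace(["Number", "NUMBER"], "NewWord")

print(exact_only)
```

The last two rows stay as they were: one contains 'Number' only as part of a longer string, and the other uses a casing that is not in the list.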

More Brute Force
If it wasn't obvious enough, this is far inferior to @coldspeed's answer

import re
df.applymap(lambda x: re.sub('number', 'NewWord', x, flags=re.IGNORECASE))

Or, taking a cue from @coldspeed's answer:

df.applymap(lambda x: re.sub('(?i)number', 'NewWord', x))
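One caveat with the element-wise versions: re.sub raises a TypeError on non-string cells, so they only work when every cell is a string. A hedged sketch of one way around that follows; the helper replace_number and the sample frame are hypothetical, and column.map is used here in place of applymap only to keep the sketch independent of the pandas version.

```python
import re

import pandas as pd

# Hypothetical sample data with one string column and one integer column.
df = pd.DataFrame({"col": ["this is a Number", "number"], "col3": [1, 2]})


def replace_number(cell):
    """Illustrative helper: substitute only in string cells."""
    if isinstance(cell, str):
        return re.sub("number", "NewWord", cell, flags=re.IGNORECASE)
    return cell  # leave numbers and other non-strings untouched


# Apply the helper to every cell, one column at a time.
result = df.apply(lambda column: column.map(replace_number))

print(result)
```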