
Replace invalid values with None in Pandas DataFrame


Actually in later versions of pandas this will give a TypeError:

df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping

You can do it by passing either a list or a dictionary:

In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
      0
0  None
1     3
2     2
3     5
4     1
5    -5
6    -1
7  None
8     9

But I recommend using NaNs rather than None:

In [12]: df.replace('-', np.nan)
Out[12]:
     0
0  NaN
1    3
2    2
3    5
4    1
5   -5
6   -1
7  NaN
8    9
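
To illustrate why NaN tends to be more convenient than None, here is a minimal sketch. The sample column is made up for illustration (numbers stored as strings, with '-' as the invalid marker), so adapt it to your own data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: numbers stored as strings, '-' marks invalid entries.
df = pd.DataFrame({0: ['-', '3', '2', '-5', '-']})

# Replace the placeholder with NaN so pandas treats those cells as missing.
# Only exact matches are replaced, so '-5' is left alone.
cleaned = df.replace('-', np.nan)

# The missing-value helpers now pick up the replaced cells.
n_missing = int(cleaned[0].isna().sum())

# With only numeric strings and NaN left, the column converts to floats.
numeric = pd.to_numeric(cleaned[0])
```

After this, methods such as isna(), dropna() and fillna() work on the column directly.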


I prefer the solution using replace with a dict because of its simplicity and elegance:

df.replace({'-': None})

You can also have more replacements:

df.replace({'-': None, 'None': None})

Even with many replacements, it stays obvious what is replaced by what, which in my opinion is much harder to see with long lists.
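
A small sketch of the dict form on a made-up column (the data and the variable names are only for illustration). Depending on the pandas version and the column dtype, the replaced cells may show up as None or as NaN, so the check below uses isna(), which covers both:

```python
import pandas as pd

# Hypothetical sample data with two different "invalid" markers.
df = pd.DataFrame({0: ['-', '3', 'None', '5']})

# Each key is a value to look for; each dict value is its replacement.
replacements = {'-': None, 'None': None}
cleaned = df.replace(replacements)

# isna() is True for both None and NaN, so it works either way.
missing_mask = cleaned[0].isna().tolist()
```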


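where is probably what you're looking for. It keeps the entries where the condition is True and replaces the rest, so the condition has to select the values you want to keep:

data = data.where(data != '-', None)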

From the pandas docs:

where [returns] an object of same shape as self and whose corresponding entries are from self where cond is True and otherwise are from other.
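
Because where keeps the entries for which the condition is True, the condition has to describe the values to keep, not the ones to replace. A minimal sketch on made-up data, with mask shown as the inverse spelling (the cells that get filled in may display as None or NaN depending on pandas version and dtype):

```python
import pandas as pd

# Hypothetical sample data: '-' marks the invalid entries.
data = pd.DataFrame({0: ['-', '3', '2', '-']})

# where() keeps entries where the condition is True and fills the rest
# with `other`, so we test for "is not the placeholder".
kept = data.where(data != '-', None)

# mask() is the inverse: it replaces entries where the condition is True.
# With no `other` given, the replaced cells become missing values.
masked = data.mask(data == '-')
```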