How to set a cell to NaN in a pandas dataframe


Just use replace:

In [106]: df.replace('N/A', np.nan)
Out[106]:
    x    y
0  10   12
1  50   11
2  18  NaN
3  32   13
4  47   15
5  20  NaN

What you're trying is called chained indexing, which may assign to a copy instead of the original: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

You can use loc to ensure you operate on the original df:

In [108]: df.loc[df['y'] == 'N/A', 'y'] = np.nan
          df
Out[108]:
    x    y
0  10   12
1  50   11
2  18  NaN
3  32   13
4  47   15
5  20  NaN
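To make the loc approach easy to try, here is a minimal sketch. The frame is a hypothetical reconstruction of the question's data (the values are taken from the output above, and the name `mask` is only for illustration), so treat it as an assumption rather than the asker's actual code.

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction: column 'y' mixes numbers with the string 'N/A'.
df = pd.DataFrame({
    "x": [10, 50, 18, 32, 47, 20],
    "y": [12, 11, "N/A", 13, 15, "N/A"],
})

# A single .loc call selects the rows (boolean mask) and the column together,
# so the assignment is applied to df itself rather than to a temporary object.
mask = df["y"] == "N/A"
df.loc[mask, "y"] = np.nan

print(df)
```

The point of the design is that row selection and column selection happen in one indexing operation, which is what the linked documentation recommends over `df['y'][mask] = ...`.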


While using replace seems to solve the problem, I would like to propose an alternative. The underlying problem is a column that mixes numeric values with a few strings. Rather than just replacing those strings with np.nan, it is better to make the whole column properly numeric. I would bet that the original column is most likely of object type:

Name: y, dtype: object

What you really need is to make it a numeric column (it will have the proper type and operations on it will be considerably faster), with all non-numeric values replaced by NaN.

Thus, a good conversion would be:

df['y'] = pd.to_numeric(df['y'], errors='coerce')

Specify errors='coerce' to force strings that can't be parsed as numeric values to become NaN. The column type would then be:

Name: y, dtype: float64
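As a quick check of the conversion described above, the following sketch builds a small object-typed Series (the values are assumed from the question's example, and the name `converted` is only illustrative) and coerces it to a numeric type.

```python
import pandas as pd

# Assumed sample data: numbers mixed with the placeholder string 'N/A'.
s = pd.Series([12, 11, "N/A", 13, 15, "N/A"], name="y")
print(s.dtype)  # object, because the values are of mixed types

# errors='coerce' turns every value that cannot be parsed as a number into NaN.
converted = pd.to_numeric(s, errors="coerce")
print(converted.dtype)  # float64
print(converted)
```

Any other unparseable string would become NaN in the same way, so this approach does not require listing every placeholder in advance.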


You can use replace:

df['y'] = df['y'].replace({'N/A': np.nan})

Also be aware of the inplace parameter for replace. You can do something like:

df.replace({'N/A': np.nan}, inplace=True)

This will replace all instances in the df, modifying it in place instead of returning a new DataFrame.
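The difference between the two call styles can be seen in a small sketch. The two-row frame and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up two-row frame with one 'N/A' placeholder.
df = pd.DataFrame({"x": [10, 18], "y": [12, "N/A"]})

# Without inplace, replace returns a new DataFrame and leaves df untouched.
result = df.replace({"N/A": np.nan})
placeholders_left = int((df["y"] == "N/A").sum())
print(placeholders_left)  # the original still contains the placeholder

# With inplace=True, df itself is modified.
df.replace({"N/A": np.nan}, inplace=True)
missing_after = int(df["y"].isna().sum())
print(missing_after)
```

If you prefer to avoid inplace, assigning the result back (`df = df.replace(...)`) has the same visible effect.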

Similarly, if you run into other types of unknown values, such as an empty string or a None value:

df['y'] = df['y'].replace({'': np.nan})
df['y'] = df['y'].replace({None: np.nan})
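Several placeholders can also go into one mapping, so the column is cleaned in a single call. This is a sketch on assumed sample data. The resulting dtype after replace can differ between pandas versions, so the example only inspects which entries are missing.

```python
import numpy as np
import pandas as pd

# Assumed sample data with two different placeholder strings.
df = pd.DataFrame({"y": [12, "N/A", "", 13]})

# One dict maps every placeholder to NaN in a single replace call.
df["y"] = df["y"].replace({"N/A": np.nan, "": np.nan})

print(df["y"].isna().tolist())
```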

Reference: Pandas Latest - Replace