How to count NaN values in a pandas DataFrame?
If you want to count only NaN values in column 'a' of a DataFrame df, use:
len(df) - df['a'].count()
Here count() tells us the number of non-NaN values, and this is subtracted from the total number of values (given by len(df)).
To count NaN values in every column of df, use:
len(df) - df.count()
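As a minimal sketch of the two expressions above, the following uses a small made-up DataFrame (the data is hypothetical, chosen only for illustration). It also compares the result with df.isna().sum(), which should give the same per-column counts.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, not from the original question
df = pd.DataFrame({'a': [1, np.nan, 3, np.nan], 'b': [np.nan, 2, 3, 4]})

# NaN count in a single column: total rows minus non-NaN values
nan_in_a = len(df) - df['a'].count()

# NaN count for every column at once (a Series indexed by column name)
nan_per_column = len(df) - df.count()

# Equivalent per-column count using isna()
nan_per_column_alt = df.isna().sum()

print(nan_in_a)
print(nan_per_column)
```

The subtraction form works because count() skips NaN values, while len(df) counts every row.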
If you want to use value_counts, tell it not to drop NaN values by setting dropna=False (added in 0.14.1):
dfv = dfd['a'].value_counts(dropna=False)
This allows the missing values in the column to be counted too:
 3     3
NaN    2
 1     1
Name: a, dtype: int64
The rest of your code should then work as you expect (note that it's not necessary to call sum; just print("nan: %d" % dfv[np.nan]) suffices).
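Here is a small self-contained sketch of the value_counts approach. The contents of dfd are an assumption, made up so that the counts match the output shown above; the boolean-mask lookup at the end is an alternative for cases where indexing by np.nan directly is not convenient.

```python
import numpy as np
import pandas as pd

# Hypothetical data: three 3s, two NaNs, one 1
dfd = pd.DataFrame({'a': [3, 3, 3, np.nan, np.nan, 1]})

# Keep NaN as its own entry in the counts
dfv = dfd['a'].value_counts(dropna=False)

# Look up the NaN count directly by label
print("nan: %d" % dfv[np.nan])

# Alternative lookup: select the entries whose index label is NaN
nan_count = int(dfv[dfv.index.isna()].sum())
```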
A clean way to count all NaNs across all columns of your DataFrame is:
import pandas as pd
import numpy as np

df = pd.DataFrame({'a':[1,2,np.nan], 'b':[np.nan,1,np.nan]})
print(df.isna().sum().sum())
A single sum gives the count of NaNs in each column. The second sum adds those per-column counts into one total.
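To make the two steps visible, this sketch uses the same sample data and keeps the intermediate per-column result in its own variable. The per-row count with axis=1 is an extra illustration, not part of the answer above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, np.nan], 'b': [np.nan, 1, np.nan]})

# First sum: NaN count per column (a Series indexed by column name)
per_column = df.isna().sum()

# Second sum: add the per-column counts into a single total
total = per_column.sum()

# Extra illustration: NaN count per row, summing across columns
per_row = df.isna().sum(axis=1)

print(per_column)
print(total)
```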