
Pandas select all columns without NaN


You can create a DataFrame with the non-NaN columns using

df = df[df.columns[~df.isnull().all()]]

Or

null_cols = df.columns[df.isnull().all()]
df.drop(null_cols, axis=1, inplace=True)
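
As a quick sketch (the column names and values below are hypothetical, just for illustration), both approaches keep the same columns:

import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [np.nan, np.nan], 'c': [3, np.nan]})

# Keep only the columns that are not entirely NaN ('b' is dropped)
kept = df[df.columns[~df.isnull().all()]]

# Equivalent: find the all-NaN columns and drop them
null_cols = df.columns[df.isnull().all()]
dropped = df.drop(null_cols, axis=1)

print(kept.columns.tolist())     # ['a', 'c']
print(dropped.columns.tolist())  # ['a', 'c']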

If you wish to remove columns based on a certain percentage of NaNs, say columns with more than 90% of their data null:

cols_to_delete = df.columns[df.isnull().sum() / len(df) > 0.90]
df.drop(cols_to_delete, axis=1, inplace=True)
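
The same pattern works for any cutoff. A minimal sketch (the column names and the 50% threshold here are made up for illustration):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'mostly_nan': [np.nan, np.nan, np.nan, 1],   # 75% NaN
    'full':       [1, 2, 3, 4],                  # 0% NaN
})

threshold = 0.5
cols_to_delete = df.columns[df.isnull().sum() / len(df) > threshold]
df.drop(cols_to_delete, axis=1, inplace=True)

print(df.columns.tolist())  # ['full'] -- 'mostly_nan' exceeded the threshold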


df[df.columns[~df.isnull().any()]] will give you a DataFrame with only the columns that have no null values, and should be the solution.

df[df.columns[~df.isnull().all()]] only removes the columns that have nothing but null values and leaves columns with even one non-null value.

df.isnull() will return a dataframe of booleans with the same shape as df. These bools will be True if the particular value is null and False if it isn't.

df.isnull().any() will return True for all columns with even one null. This is where I'm diverging from the accepted answer, as df.isnull().all() will not flag columns that contain even one non-null value!
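
A small sketch of the difference (the sample data here is hypothetical):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'clean':   [1, 2, 3],                # no NaN
    'partial': [1, np.nan, 3],           # one NaN
    'all_nan': [np.nan, np.nan, np.nan]  # only NaN
})

# .any() flags every column containing at least one NaN
print(df.columns[~df.isnull().any()].tolist())  # ['clean']

# .all() only flags columns that are entirely NaN
print(df.columns[~df.isnull().all()].tolist())  # ['clean', 'partial']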


I assume that you want to get all the columns without any NaN. If that's the case, you can first get the names of the columns without any NaN using ~col.isnull().any(), then use that to select your columns.

I can think of the following code:

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [23, 54, np.nan, 87],
    'col2': [45, 39, 45, 32],
    'col3': [np.nan, np.nan, 76, np.nan],
})

# This function will check if the column has more null values than the threshold
def has_nan(col, threshold=0):
    return col.isnull().sum() > threshold

# Then you apply the "complement" of the function to get the columns with
# no NaN.
df.loc[:, ~df.apply(has_nan)]

# ... or pass the threshold as a parameter, if needed
df.loc[:, ~df.apply(has_nan, args=(2,))]