Predicting missing values with scikit-learn's Imputer module


Per the documentation, sklearn.preprocessing.Imputer.fit_transform returns a new array; it does not alter the array passed in. The minimal fix is therefore to assign the result back:

X = imp.fit_transform(X)
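
To make the "returns a new array" point concrete, here is a minimal sketch. It assumes a recent scikit-learn, so it uses SimpleImputer from sklearn.impute rather than the older Imputer class, and it relies on the default copy=True; the toy array X and the name X_filled are made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Toy data (made up for illustration): one missing entry in the first column.
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [5.0, 6.0]])

imp = SimpleImputer(missing_values=np.nan, strategy="mean")

# fit_transform returns the imputed data; with the default copy=True
# the array passed in is left untouched.
X_filled = imp.fit_transform(X)

print(np.isnan(X).any())         # the input still contains the NaN
print(np.isnan(X_filled).any())  # the returned array does not
```

Calling imp.fit_transform(X) without using the return value therefore has no visible effect on X, which is why the result has to be assigned.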


As of scikit-learn 0.20, imputation moved to the sklearn.impute module: Imputer was deprecated in favor of SimpleImputer (and removed in 0.22). The imputer is now used like this:

import numpy as np
from sklearn.impute import SimpleImputer

impute = SimpleImputer(missing_values=np.nan, strategy='mean')
impute.fit(X)
X = impute.transform(X)

Points to note:

np.nan is used instead of the string 'NaN'.

The axis parameter is no longer needed; SimpleImputer always imputes column by column.

The variable name is arbitrary: imp or imputer works just as well as impute.
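
Since the axis parameter is gone, a short sketch may help show what happens by default and one way to approximate the old row-wise behaviour. The data and the names by_column and by_row are made up for illustration, and the transpose trick is only a workaround, not an official replacement for axis=1.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up data with one missing value in each row.
X = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan]])

impute = SimpleImputer(missing_values=np.nan, strategy="mean")

# Default behaviour: each NaN is replaced by the mean of its column.
by_column = impute.fit_transform(X)

# Row-wise imputation: transpose, impute column-wise, transpose back.
by_row = impute.fit_transform(X.T).T

print(by_column)
print(by_row)
```

In by_column the missing entries become 5.0 and 3.0 (their column means); in by_row they become 2.0 and 4.5 (their row means).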


Note: Due to a change in the sklearn library, 'NaN' has to be replaced with np.nan, as shown below.

import numpy as np
from sklearn.preprocessing import Imputer  # deprecated in 0.20, removed in 0.22

imputer = Imputer(missing_values=np.nan, strategy='mean', axis=0)
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
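
On scikit-learn 0.22 and later the sklearn.preprocessing.Imputer import fails, so the same column-slice imputation has to go through SimpleImputer. The following is a sketch of an equivalent, assuming X is a numeric NumPy array; the sample values are invented for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up numeric data; columns 1 and 2 each contain a missing value.
X = np.array([[0.0, 44.0, 72000.0],
              [1.0, 27.0, np.nan],
              [2.0, np.nan, 54000.0]])

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# Impute only columns 1 and 2 and write the result back into X.
X[:, 1:3] = imputer.fit_transform(X[:, 1:3])

print(X)
```

Column 0 is left as it was, while the gaps in columns 1 and 2 are filled with 35.5 and 63000.0, the means of the observed values in those columns.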