Predicting missing values with scikit-learn's Imputer module

python numpy scikit-learn prediction imputation

Per the documentation, sklearn.preprocessing.Imputer.fit_transform returns a new array; it does not alter the array passed in as its argument. The minimal fix is therefore to assign the result back:

X = imp.fit_transform(X)
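To make the difference concrete, here is a minimal sketch. It assumes a recent scikit-learn, so it uses SimpleImputer from sklearn.impute (the old Imputer class is gone from current releases), and the array values and the name X_original are made up for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Made-up data with one missing value in each column
X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [5.0, np.nan]])

imp = SimpleImputer(missing_values=np.nan, strategy="mean")

# Calling fit_transform and discarding the result leaves X untouched
imp.fit_transform(X)
print(np.isnan(X).any())  # True: the NaNs are still in X

# Keep a handle on the original array, then assign the result back
X_original = X
X = imp.fit_transform(X)
print(np.isnan(X).any())  # False: X now refers to the imputed copy
```

The input keeps its NaNs only because the imputer's copy parameter defaults to True; the rebinding of X is what makes the filled values visible to later code.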


Since scikit-learn 0.20, imputation lives in the sklearn.impute module, and the SimpleImputer class replaces the old Imputer. It can be used like this:

import numpy as np
from sklearn.impute import SimpleImputer
impute = SimpleImputer(missing_values=np.nan, strategy='mean')
impute.fit(X)
X = impute.transform(X)

Points to note:

missing_values is np.nan instead of the string 'NaN'.

The axis parameter is no longer needed; SimpleImputer always imputes column by column.

The variable name is arbitrary; imp or imputer works just as well as impute.
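As a rough illustration of the separate fit and transform steps, the sketch below fits on one array and fills a second one with the learned column means. The arrays X_train and X_new and the names learned_means and X_new_filled are hypothetical; statistics_ is the fitted attribute where SimpleImputer stores the per-column fill values.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical data: the imputer learns its column means from X_train
X_train = np.array([[1.0, 10.0],
                    [3.0, np.nan],
                    [np.nan, 30.0]])

# A second array, missing both values, to be filled with those means
X_new = np.array([[np.nan, np.nan]])

impute = SimpleImputer(missing_values=np.nan, strategy="mean")
impute.fit(X_train)

# Per-column means computed during fit, ignoring the NaNs
learned_means = impute.statistics_

# transform replaces each NaN with the mean of its column from fit
X_new_filled = impute.transform(X_new)
print(learned_means)
print(X_new_filled)
```

Keeping fit and transform separate is mainly useful when the same fill values have to be reused on data that was not seen during fitting.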


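Note: Due to a change in the scikit-learn library, 'NaN' has to be replaced with np.nan, as shown below. The Imputer class used here was deprecated in scikit-learn 0.20 and removed in 0.22, so this snippet only runs on older versions.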

import numpy as np
from sklearn.preprocessing import Imputer
imputer = Imputer(missing_values=np.nan, strategy='mean', axis=0)
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])
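For newer scikit-learn versions, the same column-slice approach can be sketched with SimpleImputer. This is an assumption-laden illustration, not the original poster's code: the array contents (an ID column followed by two numeric columns) are invented, and only columns 1 and 2 are imputed.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Invented data: column 0 is an ID, columns 1 and 2 contain missing values
X = np.array([[0.0, 44.0, 72000.0],
              [1.0, 27.0, np.nan],
              [2.0, np.nan, 54000.0],
              [3.0, 38.0, 61000.0]])

imputer = SimpleImputer(missing_values=np.nan, strategy="mean")

# Fit on the two numeric columns only, then write the result back in place
imputer = imputer.fit(X[:, 1:3])
X[:, 1:3] = imputer.transform(X[:, 1:3])

print(X)
```

Assigning into the slice X[:, 1:3] leaves column 0 untouched, which is the point of restricting the imputer to a subset of columns.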
