OneHotEncoder categorical_features deprecated, how to transform specific column

python machine-learning categorical-data one-hot-encoding

There are actually two warnings:

FutureWarning: The handling of integer data will change in version 0.22. Currently, the categories are determined based on the range [0, max(values)], while in the future they will be determined based on the unique values. If you want the future behaviour and silence this warning, you can specify "categories='auto'". In case you used a LabelEncoder before this OneHotEncoder to convert the categories to integers, then you can now use the OneHotEncoder directly.

and the second:

The 'categorical_features' keyword is deprecated in version 0.20 and will be removed in 0.22. You can use the ColumnTransformer instead.
"use the ColumnTransformer instead.", DeprecationWarning)

In the future, you should not define the columns in the OneHotEncoder directly, unless you want to use "categories='auto'". The first message also tells you to use OneHotEncoder directly, without the LabelEncoder first. Finally, the second message tells you to use ColumnTransformer, which is like a Pipeline for column transformations.
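As a rough illustration of the "unique values" behaviour described in the first warning, here is a minimal sketch. It assumes scikit-learn 0.20 or later, and the integer column `X_int` is made-up data, not taken from the question.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Hypothetical integer-coded column with gaps: only 0, 2 and 5 occur.
X_int = np.array([[0], [2], [5], [2]])

# categories="auto" derives the categories from the unique values,
# so the result has 3 columns rather than one per value in [0, max].
enc = OneHotEncoder(categories="auto")
encoded = enc.fit_transform(X_int).toarray()

print(enc.categories_[0])
print(encoded.shape)
```

The learned categories are stored in `enc.categories_`, one array per input column.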

Here is the equivalent code for your case:

    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    # The last argument ([0]) is the list of columns you want to transform in this step
    ct = ColumnTransformer([("Name_Of_Your_Step", OneHotEncoder(), [0])], remainder="passthrough")
    ct.fit_transform(X)

For the above example:

Encoding categorical data (basically, changing text to numerical data, e.g. the country name):

    from sklearn.preprocessing import LabelEncoder, OneHotEncoder
    from sklearn.compose import ColumnTransformer

    # Encode Country Column
    labelencoder_X = LabelEncoder()
    X[:, 0] = labelencoder_X.fit_transform(X[:, 0])
    ct = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder='passthrough')
    X = ct.fit_transform(X)


As of version 0.22, you can write the same code as below:

    from sklearn.preprocessing import OneHotEncoder
    from sklearn.compose import ColumnTransformer

    ct = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder='passthrough')
    X = ct.fit_transform(X)

As you can see, you don't need to use LabelEncoder anymore.
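To make this concrete, here is a small self-contained sketch that applies the encoder straight to a string column. The toy array (country, age, salary) is hypothetical and only stands in for the `X` from the question.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: column 0 holds country names, columns 1-2 are numeric.
X = np.array(
    [
        ["France", 44, 72000],
        ["Spain", 27, 48000],
        ["Germany", 30, 54000],
        ["Spain", 38, 61000],
    ],
    dtype=object,
)

# One-hot encode column 0 directly from the strings (no LabelEncoder step)
# and pass the remaining columns through unchanged.
ct = ColumnTransformer(
    [("Country", OneHotEncoder(), [0])],
    remainder="passthrough",
)
X_encoded = ct.fit_transform(X)

# The fitted encoder is reachable by the step name given above.
categories = ct.named_transformers_["Country"].categories_[0]
print(categories)
print(X_encoded.shape)
```

Three distinct countries produce three indicator columns, so the four-row, three-column input becomes four rows by five columns.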


    from sklearn.compose import ColumnTransformer
    from sklearn.preprocessing import OneHotEncoder

    transformer = ColumnTransformer(
        transformers=[
            ("Country",        # Just a name
             OneHotEncoder(),  # The transformer instance
             [0]               # The column(s) to be applied on.
             )
        ],
        remainder='passthrough'
    )
    X = transformer.fit_transform(X)

remainder='passthrough' keeps the other columns unchanged, while column 0 is replaced by its one-hot encoded columns.
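The effect of the `remainder` argument can be seen by comparing output shapes. This is only a sketch on made-up two-column data; `X_demo`, `keep` and `drop` are illustrative names.

```python
import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: a country column and one numeric column.
X_demo = np.array([["France", 44], ["Spain", 27], ["France", 30]], dtype=object)

# remainder="passthrough" appends the untouched columns to the right of the encoded ones.
keep = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder="passthrough")
# remainder="drop" (the default) discards every column not listed in a transformer.
drop = ColumnTransformer([("Country", OneHotEncoder(), [0])], remainder="drop")

X_keep = keep.fit_transform(X_demo)
X_drop = drop.fit_transform(X_demo)

print(X_keep.shape)  # two indicator columns plus the numeric column
print(X_drop.shape)  # only the two indicator columns
```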
