Converting a Pandas Dataframe column into one hot labels

python pandas sklearn-pandas one-hot-encoding

Here is an example of using sklearn.preprocessing.LabelBinarizer:

    In [361]: from sklearn.preprocessing import LabelBinarizer

    In [362]: lb = LabelBinarizer()

    In [363]: df['new'] = lb.fit_transform(df['ABC']).tolist()

    In [364]: df
    Out[364]:
      Col1 ABC        new
    0  XYZ   A  [1, 0, 0]
    1  XYZ   B  [0, 1, 0]
    2  XYZ   C  [0, 0, 1]
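For anyone who wants to run this outside an IPython session, here is a minimal self-contained sketch. The sample frame is an assumption reconstructed from the output above, and the names `encoded`, `labels` and `decoded` are just illustrative.

```python
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

# Sample frame assumed to match the one shown in the answer.
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})

lb = LabelBinarizer()
encoded = lb.fit_transform(df["ABC"])  # 2-D integer array, one column per class
df["new"] = encoded.tolist()           # each row becomes a Python list

# classes_ records which label each position in the vector stands for.
labels = list(lb.classes_)

# inverse_transform maps the one-hot rows back to the original labels.
decoded = list(lb.inverse_transform(encoded))

print(df)
print(labels, decoded)
```

One thing to watch for: when the column has exactly two distinct labels, LabelBinarizer returns a single 0/1 column instead of two columns, so the lists would have length 1.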

Pandas alternative:

    In [370]: df['new'] = df['ABC'].str.get_dummies().values.tolist()

    In [371]: df
    Out[371]:
      Col1 ABC        new
    0  XYZ   A  [1, 0, 0]
    1  XYZ   B  [0, 1, 0]
    2  XYZ   C  [0, 0, 1]
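A runnable version of the pandas-only approach might look like the sketch below. The sample frame is again an assumption based on the output shown, and `dummies` and `order` are illustrative names; keeping the dummy columns around lets you see which label each position in the list refers to.

```python
import pandas as pd

# Sample frame assumed to match the one shown in the answer.
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})

dummies = df["ABC"].str.get_dummies()  # one 0/1 column per distinct string
df["new"] = dummies.values.tolist()    # each row becomes a Python list
order = list(dummies.columns)          # label represented by each position

print(df)
print(order)
```

Note that `str.get_dummies()` splits each string on `|` by default, so a label that contains that character would be treated as several labels.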


You can just use tolist():

    df['ABC'] = pd.get_dummies(df.ABC).values.tolist()

      Col1        ABC
    0  XYZ  [1, 0, 0]
    1  XYZ  [0, 1, 0]
    2  XYZ  [0, 0, 1]
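Depending on your pandas version, `pd.get_dummies` may return boolean columns by default, in which case the lists would hold True/False rather than 1/0. A small sketch that asks for integers explicitly, assuming the same sample frame as in the question:

```python
import pandas as pd

# Sample frame assumed to match the one shown in the answer.
df = pd.DataFrame({"Col1": ["XYZ", "XYZ", "XYZ"], "ABC": ["A", "B", "C"]})

# dtype=int forces 0/1 integers regardless of the version's default dtype.
df["ABC"] = pd.get_dummies(df["ABC"], dtype=int).values.tolist()

print(df)
```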


If you have a pd.DataFrame like this:

    >>> df
      Col1  A  B  C
    0  XYZ  1  0  0
    1  XYZ  0  1  0
    2  XYZ  0  0  1

You can always do something like this:

    >>> df.apply(lambda s: list(s[1:]), axis=1)
    0    [1, 0, 0]
    1    [0, 1, 0]
    2    [0, 0, 1]
    dtype: object

Note that this is essentially a for-loop over the rows. Also note that pandas has no list dtype: a column that holds lists must have dtype object, so operations on that column cannot take advantage of the speed benefits of numpy.
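If the row-wise loop is a concern, one option is to pull the indicator columns out as a single numpy array and only convert to lists at the end. This is a sketch rather than the only way to do it; the frame is assumed to match the one above, and `onehot`, `as_lists` and `class_index` are illustrative names.

```python
import pandas as pd

# Frame assumed to match the already-encoded example above.
df = pd.DataFrame({
    "Col1": ["XYZ", "XYZ", "XYZ"],
    "A": [1, 0, 0],
    "B": [0, 1, 0],
    "C": [0, 0, 1],
})

# Select the indicator columns by name and convert them in one step.
onehot = df[["A", "B", "C"]].to_numpy()  # numeric 2-D array, shape (3, 3)

# Same result as the apply() call, without a Python-level lambda per row.
as_lists = pd.Series(onehot.tolist(), index=df.index)  # dtype is object

# Numeric work is better done on the array itself, e.g. the class position.
class_index = onehot.argmax(axis=1)

print(as_lists)
print(class_index)
```

Keeping `onehot` as an array and treating the list column as a display format avoids doing arithmetic on an object column.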
