One Hot Encoding using numpy [duplicate]


Usually, when you want to get a one-hot encoding for classification in machine learning, you have an array of indices.

import numpy as np

nb_classes = 6
targets = np.array([[2, 3, 4, 0]]).reshape(-1)
one_hot_targets = np.eye(nb_classes)[targets]

The one_hot_targets is now

array([[ 0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.]])

The .reshape(-1) is there to make sure the labels are in the right format (you might also have [[2], [3], [4], [0]]). The -1 is a special value that tells NumPy to infer the size of that dimension from the total number of elements. Since it is the only dimension given, the array is flattened to 1-D.
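To see what the flattening changes, here is a minimal sketch that compares a column-shaped label array before and after .reshape(-1). The names column_labels and flat_labels are made up for illustration.

```python
import numpy as np

nb_classes = 6
# Hypothetical labels stored as a column, shape (4, 1)
column_labels = np.array([[2], [3], [4], [0]])

flat_labels = column_labels.reshape(-1)  # shape (4,)

# Indexing the identity matrix with the flat labels picks one row per label
one_hot = np.eye(nb_classes)[flat_labels]
print(flat_labels.shape)  # (4,)
print(one_hot.shape)      # (4, 6)

# Without flattening, the extra axis of the index array is kept
unflattened = np.eye(nb_classes)[column_labels]
print(unflattened.shape)  # (4, 1, 6)
```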

Copy-Paste solution

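def get_one_hot(targets, nb_classes):
    targets = np.asarray(targets)  # also accept plain lists, which have no .shape
    res = np.eye(nb_classes)[targets.reshape(-1)]
    # Append a class axis to the original shape
    return res.reshape(list(targets.shape) + [nb_classes])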
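A short usage sketch for a multi-dimensional input. The function is restated so the snippet runs on its own. The np.asarray conversion is an assumption added here so that plain lists work too, and the batch array is a made-up example.

```python
import numpy as np

def get_one_hot(targets, nb_classes):
    # np.asarray is an addition so that lists are accepted as well as arrays
    targets = np.asarray(targets)
    res = np.eye(nb_classes)[targets.reshape(-1)]
    return res.reshape(list(targets.shape) + [nb_classes])

# Hypothetical batch: 2 sequences of 3 class indices each
batch = np.array([[0, 2, 1], [3, 3, 0]])
encoded = get_one_hot(batch, nb_classes=4)

print(encoded.shape)  # (2, 3, 4)
# argmax over the last axis recovers the original indices
print(np.array_equal(encoded.argmax(axis=-1), batch))  # True
```

The output keeps the shape of the input and adds one trailing axis of length nb_classes.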

Package

You can use mpu.ml.indices2one_hot. It's tested and simple to use:

import mpu.ml

one_hot = mpu.ml.indices2one_hot([1, 3, 0], nb_classes=5)


Something like:

np.array([int(i == 5) for i in range(10)])

should do the trick. I suppose there are other solutions using numpy, though.

Edit: the reason your formula does not work is that np.put does not return anything; it modifies the array passed as its first argument in place. The correct way to use np.put() is:

a = np.zeros(10)
np.put(a, 5, 1)

The drawback is that this cannot be done in one line, since the array must be defined before it is passed to np.put().
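The in-place behaviour can be checked directly. The sketch below also shows one possible single-expression alternative based on np.eye; it is a different technique from np.put and is offered only as an illustration.

```python
import numpy as np

a = np.zeros(10)
result = np.put(a, 5, 1)  # modifies `a` in place

print(result)  # None
print(a)       # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]

# Alternative in one expression: take row 5 of the 10x10 identity matrix
one_liner = np.eye(10)[5]
print(np.array_equal(a, one_liner))  # True
```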


You could use a list comprehension:

[0 if i !=5 else 1 for i in range(10)]

which evaluates to

[0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
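For comparison, the same result can be sketched with a vectorized numpy comparison instead of a Python loop. This is one possible equivalent, not part of the answer above.

```python
import numpy as np

# Compare every position 0..9 with the hot index, then cast booleans to ints
one_hot = (np.arange(10) == 5).astype(int)
print(one_hot.tolist())  # [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
```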