How does Keras handle multilabel classification?
In short
Don't use softmax.
Use sigmoid for the activation of your output layer.
Use binary_crossentropy for the loss function.
Use predict for evaluation, then threshold the predicted probabilities yourself.
Why
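With softmax, increasing the score for one label lowers all the others, because the outputs form a probability distribution that sums to 1. You don't want that when several labels can apply to the same sample. Sigmoid scores each label independently, so raising one output leaves the others unchanged.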
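To make the difference concrete, here is a small sketch in plain Python. The logits are made-up numbers, and the helper functions are written out by hand for illustration rather than taken from Keras.

```python
import math

def softmax(scores):
    """Normalize scores into a distribution that sums to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(scores):
    """Squash each score into (0, 1) independently of the others."""
    return [1.0 / (1.0 + math.exp(-s)) for s in scores]

# Made-up logits for three labels; only the first one changes.
before = [2.0, 1.0, 0.5]
after = [4.0, 1.0, 0.5]

softmax_before, softmax_after = softmax(before), softmax(after)
sigmoid_before, sigmoid_after = sigmoid(before), sigmoid(after)

# Softmax: the second and third outputs shrink when the first logit grows.
print("softmax:", softmax_before, "->", softmax_after)
# Sigmoid: the second and third outputs stay exactly the same.
print("sigmoid:", sigmoid_before, "->", sigmoid_after)
```

With softmax the labels compete for a fixed total of 1, which is right for single-label problems. With sigmoid any number of labels can be close to 1 at the same time.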
Complete Code
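from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import SGD

# X_train, y_train, X_test and y_test are assumed to be NumPy arrays;
# y_train is a 0/1 matrix with one column per label.
model = Sequential()
model.add(Dense(5000, activation='relu', input_dim=X_train.shape[1]))
model.add(Dropout(0.1))
model.add(Dense(600, activation='relu'))
model.add(Dropout(0.1))
# One sigmoid unit per label, so each label gets its own probability.
model.add(Dense(y_train.shape[1], activation='sigmoid'))

# `lr` is deprecated in favor of `learning_rate`; the old `decay` argument
# is no longer supported by recent Keras optimizers, so it is left out.
sgd = SGD(learning_rate=0.01, momentum=0.9, nesterov=True)
model.compile(loss='binary_crossentropy', optimizer=sgd)

model.fit(X_train, y_train, epochs=5, batch_size=2000)

# predict returns per-label probabilities; threshold them at 0.5.
preds = model.predict(X_test)
preds[preds>=0.5] = 1
preds[preds<0.5] = 0
# score = compare preds and y_test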
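The last comment in the code leaves the scoring step open. One possible way to fill it in is sketched below with NumPy only. The function name multilabel_scores, the choice of metrics, and the sample arrays are all assumptions made for illustration; they are not part of Keras.

```python
import numpy as np

def multilabel_scores(y_true, y_prob, threshold=0.5):
    """Hypothetical helper: compare thresholded predictions with 0/1 targets."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)

    # Fraction of individual label decisions that are correct.
    label_accuracy = float((y_pred == y_true).mean())
    # Fraction of samples where every label is correct (strict).
    exact_match = float((y_pred == y_true).all(axis=1).mean())

    # Micro-averaged F1: pool true/false positives over all labels.
    tp = int(((y_pred == 1) & (y_true == 1)).sum())
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    denom = 2 * tp + fp + fn
    micro_f1 = 2 * tp / denom if denom else 0.0

    return {
        "label_accuracy": label_accuracy,
        "exact_match": exact_match,
        "micro_f1": micro_f1,
    }

# Made-up targets and sigmoid outputs for two samples and three labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0]])
y_prob = np.array([[0.9, 0.2, 0.4],
                   [0.1, 0.8, 0.3]])

scores = multilabel_scores(y_true, y_prob)
print(scores)
```

In the code above this would be called as multilabel_scores(y_test, preds). It works both on the raw probabilities from predict and on preds after it has been set to 0 and 1. Per-label accuracy can look flattering when most labels are 0 for most samples, which is why the sketch also reports exact match and micro F1.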