How to plot precision and recall of multiclass classifier?


From scikit-learn documentation:

Precision-recall curves are typically used in binary classification to study the output of a classifier. In order to extend the precision-recall curve and average precision to multi-class or multi-label classification, it is necessary to binarize the output. One curve can be drawn per label, but one can also draw a precision-recall curve by considering each element of the label indicator matrix as a binary prediction (micro-averaging).

ROC curves are typically used in binary classification to study the output of a classifier. In order to extend ROC curve and ROC area to multi-class or multi-label classification, it is necessary to binarize the output. One ROC curve can be drawn per label, but one can also draw a ROC curve by considering each element of the label indicator matrix as a binary prediction (micro-averaging).

Therefore, you should binarize the output and plot a precision-recall curve and a ROC curve for each class. Both curves need a continuous score per class rather than a hard label, so use predict_proba to get the class probabilities.
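To make the binarization step concrete, here is a minimal sketch on a hand-made label vector. The labels and the three-class setup are illustrative assumptions and are not part of the MNIST example; `label_binarize` is the same scikit-learn function used in the code further down.

```python
import numpy as np
from sklearn.preprocessing import label_binarize

# Hypothetical labels for a 3-class problem (illustration only)
y_true = np.array([0, 2, 1, 2, 0])

# One indicator column per class: Y[j, k] is 1 when sample j belongs to class k
Y = label_binarize(y_true, classes=[0, 1, 2])
print(Y)

# Column k is the binary "class k vs. rest" target that a per-class curve is drawn from
class_2_vs_rest = Y[:, 2]
print(class_2_vs_rest)
```

Each column of the indicator matrix is then paired with the matching column of the predict_proba output, which gives one binary problem per class.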

I divide the code into three parts:

  1. general settings, learning and prediction
  2. precision-recall curve
  3. ROC curve

1. general settings, learning and prediction

    from sklearn.datasets import fetch_openml
    from sklearn.model_selection import train_test_split
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.multiclass import OneVsRestClassifier
    from sklearn.metrics import precision_recall_curve, roc_curve
    from sklearn.preprocessing import label_binarize
    import matplotlib.pyplot as plt
    #%matplotlib inline

    # fetch_mldata has been removed from recent scikit-learn releases;
    # fetch_openml is used here instead. OpenML returns the MNIST
    # targets as strings, so cast them to int before binarizing.
    mnist = fetch_openml("mnist_784", version=1, as_frame=False)
    target = mnist.target.astype(int)

    n_classes = len(set(target))

    # Binarize the labels: one indicator column per class
    Y = label_binarize(target, classes=[*range(n_classes)])

    X_train, X_test, y_train, y_test = train_test_split(mnist.data,
                                                        Y,
                                                        random_state=42)

    clf = OneVsRestClassifier(RandomForestClassifier(n_estimators=50,
                                                     max_depth=3,
                                                     random_state=0))
    clf.fit(X_train, y_train)

    # Per-class probabilities, shape (n_samples, n_classes)
    y_score = clf.predict_proba(X_test)

2. precision-recall curve

    # precision-recall curve
    precision = dict()
    recall = dict()
    for i in range(n_classes):
        precision[i], recall[i], _ = precision_recall_curve(y_test[:, i],
                                                            y_score[:, i])
        plt.plot(recall[i], precision[i], lw=2, label='class {}'.format(i))

    plt.xlabel("recall")
    plt.ylabel("precision")
    plt.legend(loc="best")
    plt.title("precision vs. recall curve")
    plt.show()
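The quoted documentation also mentions micro-averaging, where every element of the label indicator matrix is treated as one binary prediction. A minimal sketch of that idea follows. To keep it quick to run it assumes a small synthetic dataset from `make_classification` and a `LogisticRegression` base estimator in place of MNIST and the random forest; both are stand-ins chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, precision_recall_curve
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import label_binarize

# Small synthetic 3-class problem (assumption: stands in for MNIST)
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)
Y = label_binarize(y, classes=[0, 1, 2])

X_train, X_test, Y_train, Y_test = train_test_split(X, Y, random_state=42)

clf = OneVsRestClassifier(LogisticRegression(max_iter=1000))
clf.fit(X_train, Y_train)
y_score = clf.predict_proba(X_test)

# Micro-averaging: flatten the indicator matrix and the scores so that
# every (sample, class) pair counts as one binary prediction
precision_micro, recall_micro, _ = precision_recall_curve(Y_test.ravel(),
                                                          y_score.ravel())

# Single-number summary of the micro-averaged curve
ap_micro = average_precision_score(Y_test, y_score, average="micro")
print("micro-averaged average precision: {:.3f}".format(ap_micro))
```

The arrays `recall_micro` and `precision_micro` can be passed to `plt.plot` in the same way as the per-class arrays to draw a single summary curve.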


3. ROC curve

    # ROC curve
    fpr = dict()
    tpr = dict()
    for i in range(n_classes):
        fpr[i], tpr[i], _ = roc_curve(y_test[:, i],
                                      y_score[:, i])
        plt.plot(fpr[i], tpr[i], lw=2, label='class {}'.format(i))

    plt.xlabel("false positive rate")
    plt.ylabel("true positive rate")
    plt.legend(loc="best")
    plt.title("ROC curve")
    plt.show()
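The documentation quote refers to the ROC area as well as the curve. As a rough sketch, the area can be computed per class with `auc` and micro-averaged by flattening the indicator matrix. The small label and score arrays below are made up for illustration; in the example above they would be `y_test` and `y_score`.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical binarized labels and scores: 6 samples, 3 classes (illustration only)
Y_test = np.array([[1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0],
                   [0, 0, 1],
                   [1, 0, 0],
                   [0, 1, 0]])
y_score = np.array([[0.7, 0.2, 0.1],
                    [0.1, 0.3, 0.6],
                    [0.2, 0.5, 0.3],
                    [0.3, 0.4, 0.3],
                    [0.5, 0.3, 0.2],
                    [0.4, 0.2, 0.4]])

# Area under each per-class ROC curve
roc_auc = {}
for i in range(Y_test.shape[1]):
    fpr_i, tpr_i, _ = roc_curve(Y_test[:, i], y_score[:, i])
    roc_auc[i] = auc(fpr_i, tpr_i)
    print("class {}: AUC = {:.3f}".format(i, roc_auc[i]))

# Micro-average: treat every (sample, class) entry as one binary prediction
fpr_micro, tpr_micro, _ = roc_curve(Y_test.ravel(), y_score.ravel())
roc_auc_micro = auc(fpr_micro, tpr_micro)

# roc_auc_score with average="micro" should give the same value directly
roc_auc_micro_direct = roc_auc_score(Y_test, y_score, average="micro")
print("micro-average AUC = {:.3f}".format(roc_auc_micro))
```

Adding the area to each legend entry, for example `label='class {} (AUC = {:.2f})'.format(i, roc_auc[i])`, is one way to make the plot easier to compare across classes.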
