TPR & FPR Curve for different classifiers - kNN, NaiveBayes, Decision Trees in R


ROC curve

The ROC curve you provided for the knn11 classifier looks off: it is below the diagonal, which indicates that your classifier assigns class labels correctly less than 50% of the time. Most likely you passed the wrong class labels or the wrong probabilities. If you used class labels 0 and 1 in training, those same labels should be passed to the ROC curve in the same order, without flipping 0 and 1.
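
For instance, here is a minimal sketch with the ROCR package (assuming that is what you are plotting with; knn_probs and test_labels are hypothetical names, not from your code). The label.ordering argument pins down which label is treated as negative and which as positive, so the curve cannot end up flipped:

    library(ROCR)

    # knn_probs: predicted probability of class 1 (hypothetical name)
    # test_labels: true labels coded 0/1, exactly as used in training (hypothetical name)
    pred <- prediction(knn_probs, test_labels, label.ordering = c(0, 1))
    perf <- performance(pred, measure = "tpr", x.measure = "fpr")

    plot(perf)
    abline(0, 1, lty = 2)  # diagonal reference; a curve below it usually means flipped labels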

Another less likely possibility is that you have a very weird dataset.

Probabilities for other classifiers

The ROC curve was originally developed for calling events from radar readings. Technically, it is closely tied to predicting a single event: the probability that you correctly detect an approaching plane on the radar. So it uses one probability. This can be confusing when someone does classification on two classes where "hit" probabilities are not evident, as in your case with cases and controls.

However, any two-class classification can be framed in terms of "hits" and "misses"; you just have to select the class you will call an "event". In your case, having diabetes might be called the event.
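
Concretely, once "tested_positive" is the event, the two rates that make up the curve are the standard ones:

    TPR = TP / (TP + FN)   (fraction of actual positives correctly called positive)
    FPR = FP / (FP + TN)   (fraction of actual negatives incorrectly called positive)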

So from this table:

         tested_negative tested_positive
    [1,]    5.787252e-03       0.9942127
    [2,]    8.433584e-01       0.1566416
    [3,]    7.880800e-09       1.0000000
    [4,]    7.568920e-01       0.2431080
    [5,]    4.663958e-01       0.5336042

You would only have to select one probability, that of the event: probably "tested_positive". The other one, "tested_negative", is just 1 - tested_positive, because when the classifier thinks that a particular person has diabetes with a 79% chance, it at the same time "thinks" that there is a 21% chance of that person not having diabetes. You only need one number to express this idea, so kNN returns one, while other classifiers can return two.
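
For example, here is a sketch assuming the matrix above is stored in a variable called probs and the true labels in test_labels (both hypothetical names), again using ROCR:

    library(ROCR)

    # Keep only the probability of the chosen event, "tested_positive"
    event_probs <- probs[, "tested_positive"]

    # Order matters: negative label first, positive (event) label second
    pred <- prediction(event_probs, test_labels,
                       label.ordering = c("tested_negative", "tested_positive"))
    perf <- performance(pred, measure = "tpr", x.measure = "fpr")
    plot(perf)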

I don't know which library you used for decision trees, so I cannot help with the output of that classifier.


It looks like you are doing something fundamentally wrong.

[image: an example ROC curve for a kNN classifier]

Ideally, a kNN ROC graph looks like the one above. Here are a few points you can check.

  1. Check the distance calculation in your code.
  2. Use the code below for prediction in Python:

Predicted class:

    print(model_name.predict(test))

Indices of the 3 nearest neighbors (kneighbors returns a tuple of (distances, indices), so [1] selects the indices):

    print(model_name.kneighbors(test)[1])