ROC curve from training data in caret


The only thing missing from ctrl is the savePredictions = TRUE argument (this also works for other resampling methods):

    library(caret)
    library(mlbench)
    data(Sonar)

    ctrl <- trainControl(method = "cv",
                         summaryFunction = twoClassSummary,
                         classProbs = TRUE,
                         savePredictions = TRUE)
    rfFit <- train(Class ~ ., data = Sonar,
                   method = "rf", preProc = c("center", "scale"),
                   trControl = ctrl)

    library(pROC)
    # Select a parameter setting
    selectedIndices <- rfFit$pred$mtry == 2
    # Plot:
    plot.roc(rfFit$pred$obs[selectedIndices],
             rfFit$pred$M[selectedIndices])

[Figure: ROC curve]

Maybe I am missing something, but one small concern is that train always reports slightly different AUC values than plot.roc and pROC::auc (absolute difference < 0.005), even though twoClassSummary uses pROC::auc to estimate the AUC.

Edit: I assume this occurs because the ROC value reported by train is the average of the AUCs computed separately on each CV hold-out set, whereas here we compute a single AUC over the predictions from all resamples pooled together.
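To illustrate the difference between the two calculations, here is a minimal base-R sketch. It does not use the fitted model: `pred` is a simulated stand-in for `rfFit$pred[selectedIndices, ]` (with the same `obs`, `M` and `Resample` columns caret stores), and `auc_rank` is a hypothetical helper that computes the AUC from the rank-sum statistic rather than calling pROC.

```r
# Hypothetical helper: AUC from the Mann-Whitney rank-sum statistic (base R only).
auc_rank <- function(score, is_pos) {
  n_pos <- sum(is_pos)
  n_neg <- sum(!is_pos)
  r <- rank(score)  # ties receive mid-ranks
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Simulated stand-in for rfFit$pred[selectedIndices, ]: 10 folds of 20 rows each.
set.seed(1)
pred <- data.frame(
  Resample = rep(sprintf("Fold%02d", 1:10), each = 20),
  obs      = factor(rep(c("M", "R"), times = 100), levels = c("M", "R"))
)
# Class "M" rows score higher on average; each fold also gets its own shift,
# mimicking models fitted on different training subsets.
fold_shift <- rep(rnorm(10, sd = 0.5), each = 20)
pred$M <- plogis(rnorm(200, mean = ifelse(pred$obs == "M", 1, -1)) + fold_shift)

# AUC per hold-out fold, then averaged (the kind of value train() reports)
per_fold      <- sapply(split(pred, pred$Resample),
                        function(d) auc_rank(d$M, d$obs == "M"))
mean_fold_auc <- mean(per_fold)

# One AUC over all hold-out predictions pooled together
# (the kind of value plot.roc() shows when given every row at once)
pooled_auc <- auc_rank(pred$M, pred$obs == "M")

c(mean_of_folds = mean_fold_auc, pooled = pooled_auc)
```

With a real model, the same comparison should work by replacing the simulated `pred` with `rfFit$pred[selectedIndices, ]`; the two numbers generally differ slightly because pooling ranks scores across folds, while the per-fold average only ranks within each fold.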

Update: Since this is getting a bit of attention, here's a solution using plotROC::geom_roc() for ggplot2:

    library(ggplot2)
    library(plotROC)
    ggplot(rfFit$pred[selectedIndices, ],
           aes(m = M, d = factor(obs, levels = c("R", "M")))) +
        geom_roc(hjust = -0.4, vjust = 1.5) + coord_equal()

[Figure: ROC curve drawn with ggplot2 and plotROC]


Here I'm modifying @thei1e's plot in a way that others may find helpful.

Train model and make predictions

    library(caret)
    library(ggplot2)
    library(mlbench)
    library(plotROC)
    data(Sonar)

    ctrl <- trainControl(method = "cv", summaryFunction = twoClassSummary,
                         classProbs = TRUE, savePredictions = TRUE)
    rfFit <- train(Class ~ ., data = Sonar, method = "rf",
                   preProc = c("center", "scale"), trControl = ctrl)

    # Select a parameter setting
    selectedIndices <- rfFit$pred$mtry == 2

Updated ROC curve plot

    g <- ggplot(rfFit$pred[selectedIndices, ],
                aes(m = M, d = factor(obs, levels = c("R", "M")))) +
      geom_roc(n.cuts = 0) +
      coord_equal() +
      style_roc()

    g + annotate("text", x = 0.75, y = 0.25,
                 label = paste("AUC =", round((calc_auc(g))$AUC, 4)))



Update 2019: the easiest way is the MLeval package (https://cran.r-project.org/web/packages/MLeval/index.html). It takes the optimal parameters and the probabilities from the caret object, then calculates a number of metrics and plots, including ROC curves, PR curves, PRG curves, and calibration curves. You can pass it multiple objects from different models to compare the results.

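    library(MLeval)
    library(caret)
    library(mlbench)  # provides the Sonar data set
    data(Sonar)

    # savePredictions = TRUE is needed so the hold-out probabilities are kept
    ctrl <- trainControl(method = "cv",
                         summaryFunction = twoClassSummary,
                         classProbs = TRUE,
                         savePredictions = TRUE)
    rfFit <- train(Class ~ ., data = Sonar,
                   method = "rf", preProc = c("center", "scale"),
                   trControl = ctrl)

    ## run MLeval
    res <- evalm(rfFit)
    ## get ROC
    res$roc
    ## get calibration curve
    res$cc
    ## get precision recall gain curve
    res$prg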
