Classification - Usage of factor levels


I can't reproduce your error exactly, but my educated guess is that the error message tells you everything you need to know:

At least one of the class levels is not a valid R variable name. This will cause errors when class probabilities are generated because the variables names will be converted to X0, X1. Please use factor levels that can be used as valid R variable names.

Emphasis mine. Your response variable has the levels "0" and "1", which aren't valid variable names in R (you can't do 0 <- "my value"). Presumably the problem will go away if you rename the levels of the response variable with something like

levels(training.dt$churn) <- c("first_class", "second_class")

as per this Q.
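To make the renaming concrete, here is a minimal sketch in base R. The data frame below is a made-up stand-in for the asker's `training.dt`, and comparing each level against `make.names()` is just one way to spot invalid names; I am not claiming it is the exact check that produces the error.

```r
# Toy stand-in for the asker's data (hypothetical values).
training.dt <- data.frame(churn = factor(c("0", "1", "1", "0")))

# One way to test validity: a valid name is left unchanged by make.names().
old_levels <- levels(training.dt$churn)
is_valid <- make.names(old_levels) == old_levels
print(is_valid)

# Rename the levels in order: "0" becomes "first_class", "1" becomes "second_class".
levels(training.dt$churn) <- c("first_class", "second_class")
print(table(training.dt$churn))
```

Assigning to `levels()` renames by position, so the order of the new names has to follow the order of the old levels.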


How about this base function:

 make.names(churn) ~ .,

to "make syntactically valid names out of character vectors"?

Source
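For reference, a small sketch of what `make.names()` does to the kind of values discussed here. The input vectors are invented for illustration. Note that it returns a character vector even when given a factor, so the result may need wrapping in `factor()` if a factor response is required.

```r
# Names that start with a digit get an "X" prefix.
fixed <- make.names(c("0", "1"))
print(fixed)

# Invalid characters such as spaces are replaced with dots.
spaced <- make.names("not churned")
print(spaced)

# Applied to a factor, the result is a character vector, not a factor.
churn <- factor(c("0", "1", "1", "0"))  # hypothetical response values
churn_chr <- make.names(churn)
churn_fct <- factor(churn_chr)
print(levels(churn_fct))
```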


Adding to the correct answer of @einar, here's the dplyr syntax for converting the factor levels:

training.dt %>%
  mutate(churn = factor(churn, levels = make.names(levels(churn))))

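I prefer to set only the labels, because passing the new names as levels changes the underlying data: the existing values "0" and "1" match none of the new levels, so they all become NA. With labels, the data stays the same and only the level names change: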

training.dt %>%
  mutate(churn = factor(churn, labels = make.names(levels(churn))))
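The difference between `levels` and `labels` is easy to check outside of dplyr. This is a sketch on a made-up vector, assuming the response has the levels "0" and "1" as in the question; the `factor()` calls are the same ones used inside `mutate()` above.

```r
# Hypothetical response with the problematic levels "0" and "1".
churn <- factor(c("0", "1", "1", "0"))
new_names <- make.names(levels(churn))

# levels = ... looks the old values up among the new levels; none match, so all are NA.
via_levels <- factor(churn, levels = new_names)

# labels = ... keeps the values and only renames the levels.
via_labels <- factor(churn, labels = new_names)

print(via_levels)
print(via_labels)
```

The `labels` form expects one label per level actually present in the data, so it may need adjusting if the factor carries unused levels.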