drop = TRUE doesn't drop factor levels in data.frame while in vector it does


The documentation clearly states:

drop : logical. If TRUE the result is coerced to the lowest possible dimension. The default is to drop if only one column is left, but not to drop if only one row is left.

This means that when drop = TRUE and the subset leaves a single column or a single row, the result is coerced to a vector (single column) or a list (single row) instead of being returned as a single-column or single-row data.frame.

Therefore, this argument has nothing to do with dropping factor levels, so the right way to remove unused levels is the one you mentioned, i.e. the droplevels function.
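To make the distinction concrete, here is a minimal sketch. The data frame d and its columns a and b are made up for illustration and are not the data from the question; the calls themselves are base R.

```r
# Hypothetical example data: one factor column and one integer column.
d <- data.frame(a = factor(c("europe", "asia", "oceania")), b = 1:3)

# Two rows and two columns remain, so drop = TRUE has nothing to collapse:
# the result is still a data.frame and the unused level "asia" is kept.
d2 <- d[d$a != "asia", , drop = TRUE]
is.data.frame(d2)   # TRUE
levels(d2$a)        # "asia" "europe" "oceania"

# A single column collapses to a vector (here a factor), levels untouched.
v <- d[d$a != "asia", "a", drop = TRUE]
levels(v)           # "asia" "europe" "oceania"

# A single row with an explicit drop = TRUE collapses to a plain list.
r <- d[1, , drop = TRUE]
is.data.frame(r)    # FALSE

# droplevels() is what actually removes the unused level.
d3 <- droplevels(d2)
levels(d3$a)        # "europe" "oceania"
```

In every case drop only changes the shape of the result; the set of levels changes only once droplevels is called.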


This is a stumbling block for many people, because "drop does something different", as Peter Dalgaard explains in http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg22459.html and digEmAll below.

If you want to drop the unused levels of every factor column in d2, use:

d2[] <- lapply(d2, function(x) if (is.factor(x)) factor(x) else x) 
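A small sketch of how that one-liner behaves, assuming a stand-in for d2 (the d and d2 built here are hypothetical, not the objects from the question):

```r
# Hypothetical stand-in for d2: a row subset that still carries an unused level.
d <- data.frame(a = factor(c("europe", "asia", "oceania")), b = 1:3)
d2 <- d[d$a != "asia", ]
levels(d2$a)        # "asia" "europe" "oceania"

# Assigning into d2[] replaces the columns but keeps d2 a data.frame.
# factor(x) rebuilds each factor from the values that are actually present;
# non-factor columns are passed through unchanged.
d2[] <- lapply(d2, function(x) if (is.factor(x)) factor(x) else x)

levels(d2$a)        # "europe" "oceania"
is.data.frame(d2)   # TRUE
d2$b                # 1 3
```

The empty brackets matter: a plain d2 <- lapply(...) would turn d2 into a list instead of keeping it a data.frame.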


What the documentation says is:

If TRUE the result is coerced to the lowest possible dimension.

So it is related to dimension, not to factor levels:

df[, 1]
# [1] europe  asia    oceania
# Levels: asia europe oceania
df[, 1, drop = FALSE]
#         a
# 1  europe
# 2    asia
# 3 oceania

Dropping factor levels is a different problem. The subsetting method for factors (see ?'[.factor') is a case where the drop argument does serve this purpose:

ff <- factor(c('AA', 'BA', 'CA'))
ff[1:2, drop = TRUE]
# [1] AA BA
# Levels: AA BA
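For comparison, a short sketch of the same factor subset with and without drop. The variable names kept and dropped are only illustrative.

```r
ff <- factor(c("AA", "BA", "CA"))

kept    <- ff[1:2]               # default drop = FALSE keeps all three levels
dropped <- ff[1:2, drop = TRUE]  # drop = TRUE discards the unused level "CA"

levels(kept)     # "AA" "BA" "CA"
levels(dropped)  # "AA" "BA"

# Calling droplevels() on the default subset ends up with the same levels.
identical(levels(droplevels(kept)), levels(dropped))  # TRUE
```

So drop means "drop unused levels" only when the object being subset is itself a factor; for a data.frame it means "drop dimensions".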