drop = TRUE doesn't drop factor levels in data.frame while in vector it does
The documentation clearly states:
drop : logical. If TRUE the result is coerced to the lowest possible dimension. The default is to drop if only one column is left, but not to drop if only one row is left.
This means that if drop = TRUE and the filtered data.frame results in a single column or row, the result is coerced to a vector/list instead of returning a single-column/single-row data.frame.
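A minimal sketch of that behaviour, assuming a small made-up data frame (the name df and its columns are illustrative only, not taken from the question):

```r
# Hypothetical example data (an assumption, not from the original question).
df <- data.frame(a = c("x", "y", "z"), b = 1:3, stringsAsFactors = TRUE)

one_col_default <- df[, 1]                # default: a single column is dropped to a vector
one_col_kept    <- df[, 1, drop = FALSE]  # stays a one-column data.frame
one_row_default <- df[1, ]                # default: a single row stays a data.frame
one_row_dropped <- df[1, , drop = TRUE]   # explicitly dropped: coerced to a list

is.factor(one_col_default)
is.data.frame(one_col_kept)
is.data.frame(one_row_default)
is.list(one_row_dropped) && !is.data.frame(one_row_dropped)
```

In all four cases only the shape of the result changes; the levels of column a are untouched.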
Therefore, this argument has nothing to do with dropping factor levels, so the right way to remove the unused levels is the one you mentioned (i.e. using the droplevels function).
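To make the difference concrete, here is a small sketch on a made-up data frame (df, sub and sub_clean are illustrative names, not the asker's objects). It assumes a factor column with three levels, one of which becomes unused after filtering:

```r
# Hypothetical data frame with a three-level factor column (an assumption).
df <- data.frame(a = factor(c("europe", "asia", "oceania")), b = 1:3)

# Filtering rows keeps every original level, used or not.
sub <- df[df$a != "asia", ]
levels(sub$a)

# drop = TRUE only changes the shape (a vector instead of a data.frame);
# the unused level "asia" is still there.
vec <- df[df$a != "asia", "a", drop = TRUE]
levels(vec)

# droplevels() is what actually removes the unused level.
sub_clean <- droplevels(sub)
levels(sub_clean$a)
```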
This is a stumbling block for many people, because "drop does something different", as Peter Dalgaard explains in http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg22459.html and as digEmAll explains below.
To get the result you are after (unused levels removed from every factor column), use:
d2[] <- lapply(d2, function(x) if (is.factor(x)) factor(x) else x)
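As a rough illustration of how that line behaves, here is a self-contained sketch. The data frame d and the way d2 is derived from it are assumptions made up for the example; only the d2[] <- lapply(...) line comes from the answer:

```r
# Hypothetical input: d2 is a filtered copy of a small data frame (an assumption).
d  <- data.frame(g = factor(c("a", "b", "c")), n = 1:3)
d2 <- d[d$g != "c", ]

levels_before <- levels(d2$g)  # still includes the unused level "c"

# Re-create each factor column so that only the levels in use remain.
# Assigning into d2[] keeps d2 a data.frame instead of turning it into a list.
d2[] <- lapply(d2, function(x) if (is.factor(x)) factor(x) else x)

levels_after <- levels(d2$g)
```

Calling factor() on an existing factor rebuilds it from the values actually present, which is why the unused level disappears; non-factor columns pass through unchanged.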
What the documentation says is:
If TRUE the result is coerced to the lowest possible dimension.
So it is related to dimension, not to factor levels:
df[, 1]
# [1] europe asia oceania
# Levels: asia europe oceania
df[, 1, drop = FALSE]
#         a
# 1  europe
# 2    asia
# 3 oceania
Dropping factor levels is a different problem. Here is a case (?'[.factor') where the argument drop appears for this purpose:
ff <- factor(c('AA', 'BA', 'CA'))
ff[1:2, drop = TRUE]
# [1] AA BA
# Levels: AA BA