Drop unused factor levels in a subsetted data frame


Since R version 2.12, there's a droplevels() function.

levels(droplevels(subdf$letters))
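
For anyone who wants to try this end to end, here is a minimal sketch. The data frame is made up for illustration; only the names df, subdf, letters and numbers come from the snippets in this thread. It also uses the data frame method of droplevels(), which handles every factor column in one call.

```r
# Hypothetical data; column names follow the snippets in this thread
df <- data.frame(letters = factor(c("a", "b", "c", "d", "e")),
                 numbers = 1:5)
subdf <- subset(df, numbers <= 3)

levels(subdf$letters)              # unused levels "d" and "e" are still present
levels(droplevels(subdf$letters))  # only the levels that actually occur

# droplevels() also has a data frame method covering all factor columns
subdf <- droplevels(subdf)
levels(subdf$letters)
```

Calling droplevels() on the whole data frame leaves non-factor columns such as numbers untouched.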


All you should have to do is apply factor() to your variable again after subsetting:

> subdf$letters
[1] a b c
Levels: a b c d e
> subdf$letters <- factor(subdf$letters)
> subdf$letters
[1] a b c
Levels: a b c

EDIT

From the factor page example:

factor(ff)      # drops the levels that do not occur

For dropping levels from all factor columns in a dataframe, you can use:

subdf <- subset(df, numbers <= 3)
subdf[] <- lapply(subdf, function(x) if(is.factor(x)) factor(x) else x)
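
If you need this in several places, the lapply() idiom can be wrapped in a small function. The sketch below is only an illustration: the name drop_unused_levels, the extra group column and the data are assumptions, not part of the original question.

```r
# Hypothetical helper wrapping the lapply() idiom
drop_unused_levels <- function(d) {
  # Assigning into d[] keeps the data.frame structure while replacing columns
  d[] <- lapply(d, function(x) if (is.factor(x)) factor(x) else x)
  d
}

# Made-up data with two factor columns and one integer column
df <- data.frame(letters = factor(c("a", "b", "c", "d", "e")),
                 group   = factor(c("x", "x", "y", "y", "z")),
                 numbers = 1:5)

subdf <- drop_unused_levels(subset(df, numbers <= 3))
sapply(subdf, nlevels)  # number of levels left in each column
```

Assigning to subdf[] rather than subdf matters here: a plain subdf <- lapply(...) would turn the data frame into a list.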


If you don't want this behaviour, don't use factors; use character vectors instead. I think this makes more sense than patching things up afterwards. Try the following before loading your data with read.table or read.csv (since R 4.0.0, stringsAsFactors = FALSE is already the default):

options(stringsAsFactors = FALSE)

The disadvantage is that character vectors carry no level order, so sorting and plotting fall back to alphabetical order (reorder() is your friend for plots).
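
A rough sketch of that workflow, assuming inline CSV text as a stand-in for a real file (the data values are invented): the column stays a character vector through subsetting, and a factor is built with reorder() only when an ordering is needed.

```r
# Hypothetical inline data standing in for a file read with read.csv()
df <- read.csv(text = "letters,numbers\na,3\nb,1\nc,2\nd,5\ne,4",
               stringsAsFactors = FALSE)
subdf <- subset(df, numbers <= 3)

is.character(subdf$letters)  # no factor, so there are no stale levels to drop

# Build a factor only when needed, with levels ordered by `numbers`
ord <- reorder(subdf$letters, subdf$numbers)
levels(ord)
```

Because the factor is created after subsetting, it only ever contains the values that are present.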