Drop unused factor levels in a subsetted data frame
All you should have to do is to apply factor() to your variable again after subsetting:
> subdf$letters
[1] a b c
Levels: a b c d e
> subdf$letters <- factor(subdf$letters)
> subdf$letters
[1] a b c
Levels: a b c
EDIT
From the example on the ?factor help page:
factor(ff) # drops the levels that do not occur
For dropping levels from all factor columns in a dataframe, you can use:
subdf <- subset(df, numbers <= 3)
subdf[] <- lapply(subdf, function(x) if (is.factor(x)) factor(x) else x)
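For a self-contained run of the above, here is a minimal sketch. The data frame df and its columns letters and numbers are assumed to resemble the question's data and are constructed here purely for illustration.

```r
# Hypothetical example data, assumed to resemble the question's data frame
df <- data.frame(letters = factor(c("a", "b", "c", "d", "e")),
                 numbers = 1:5)

subdf <- subset(df, numbers <= 3)
levels(subdf$letters)   # the unused levels "d" and "e" are still present

# Re-create every factor column so that only the levels that occur remain;
# assigning into subdf[] keeps the result a data frame
subdf[] <- lapply(subdf, function(x) if (is.factor(x)) factor(x) else x)
levels(subdf$letters)   # "a" "b" "c"
```

Base R also provides droplevels(), which drops unused levels from a factor or from all factor columns of a data frame, so droplevels(subdf) should be a shorter route to the same levels.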
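If you don't want this behaviour, don't use factors; use character vectors instead. I think this makes more sense than patching things up afterwards. From R 4.0.0 onward, strings are no longer converted to factors by default, so this is only needed on older versions. Try the following before loading your data with read.table or read.csv: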
options(stringsAsFactors = FALSE)
The disadvantage is that you're restricted to alphabetical ordering (reorder() is your friend for plots).
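As a rough sketch of the character-vector route and of reorder(): the data below are made up for illustration, and stringsAsFactors = FALSE is passed directly to data.frame() here rather than set through options().

```r
# Hypothetical data; the letters column is kept as a character vector
df <- data.frame(letters = c("b", "a", "c", "d", "e"),
                 numbers = c(3, 1, 2, 5, 4),
                 stringsAsFactors = FALSE)

subdf <- subset(df, numbers <= 3)
sort(unique(subdf$letters))   # no leftover levels; the order is alphabetical

# reorder() builds a factor whose levels are sorted by another variable
# (by default the mean of that variable within each level); plotting
# functions then follow the level order
ord <- reorder(subdf$letters, subdf$numbers)
levels(ord)                   # "a" "c" "b"
```

The factor is created only at the point where a specific ordering is needed, so the subsetted data itself never carries unused levels.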