Subset of rows containing NA (missing) values in a chosen column of a data frame
Never use == 'NA' to test for missing values: comparing anything with NA yields NA, not TRUE. Use is.na()
instead. To keep the rows that have an NA in any column:
new_DF <- DF[rowSums(is.na(DF)) > 0,]
Or, to check one particular column only, use
new_DF <- DF[is.na(DF$Var),]
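To see the difference between the two forms, here is a minimal sketch on a made-up data frame (the columns Var and Other and their values are assumptions for illustration, not taken from the question):

```r
# Hypothetical toy data, for illustration only
DF <- data.frame(Var = c(1, NA, 3, NA), Other = c("a", "b", NA, "d"))

# Comparing with == gives NA for the missing entries, never TRUE,
# so it cannot be used to select them
DF$Var == "NA"

# Rows with an NA in any column
any_NA <- DF[rowSums(is.na(DF)) > 0, ]

# Rows with an NA in the column Var only
var_NA <- DF[is.na(DF$Var), ]
```

Here any_NA also picks up the row where only Other is missing, while var_NA keeps just the rows where Var is missing.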
If the missing values are stored as the character string 'NA', first run
DF[DF=='NA'] <- NA
to replace them with missing values.
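A small sketch of that conversion, assuming a made-up data frame in which the missing entries were stored as text (the names Var and Num are hypothetical):

```r
# Hypothetical data where a missing value was stored as the text "NA"
DF <- data.frame(Var = c("x", "NA", "z"), Num = c(1, 2, 3),
                 stringsAsFactors = FALSE)

# "NA" is an ordinary string here, so is.na() does not flag it
is.na(DF$Var)

# Turn the "NA" strings into real missing values
DF[DF == "NA"] <- NA

# Now the usual subsetting finds the row
new_DF <- DF[is.na(DF$Var), ]
```

After the replacement, new_DF holds the single row whose Var was the string "NA".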
NA is a special value in R; do not confuse the NA value with the string "NA". Depending on how the data was imported, your "NA" and "NULL" cells may be of different types (by default, "NA" strings are converted to NA values, while "NULL" strings are left as is).
If you use read.table() or read.csv(), consider the na.strings argument to get a clean import, and always work with real R NA values.
An example that handles both "NULL" and "NA" cells:
DF <- read.csv("file.csv", na.strings=c("NA", "NULL"))
new_DF <- subset(DF, is.na(DF$Var2))
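The same idea can be tried without a file on disk. The sketch below passes made-up CSV content through the text argument of read.csv(); the column names and values are assumptions for illustration:

```r
# Hypothetical CSV content, supplied inline instead of "file.csv"
csv_text <- "Var1,Var2
1,a
2,NA
3,NULL
4,d"

# Treat both "NA" and "NULL" cells as missing values on import
DF <- read.csv(text = csv_text, na.strings = c("NA", "NULL"))

# Keep the rows where Var2 is missing
new_DF <- subset(DF, is.na(Var2))
```

With na.strings set this way, both the "NA" and the "NULL" rows end up in new_DF.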