Remove rows with all or some NAs (missing values) in data.frame
Also check complete.cases:

    > final[complete.cases(final), ]
                 gene hsap mmul mmus rnor cfam
    2 ENSG00000199674    0    2    2    2    2
    6 ENSG00000221312    0    1    2    3    2

na.omit is nicer for just removing all NA's. complete.cases allows partial selection by including only certain columns of the dataframe:

    > final[complete.cases(final[ , 5:6]), ]
                 gene hsap mmul mmus rnor cfam
    2 ENSG00000199674    0    2    2    2    2
    4 ENSG00000207604    0   NA   NA    1    2
    6 ENSG00000221312    0    1    2    3    2

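If you want to reproduce the two outputs above, here is a minimal sketch. Rows 2, 4 and 6 are taken from the printed output; rows 1, 3 and 5 do not appear anywhere in this thread, so their gene names and values below are made-up placeholders.

```r
# Rows 2, 4 and 6 match the output shown above; rows 1, 3 and 5 are
# hypothetical placeholders (NA everywhere except gene and hsap).
final <- data.frame(
  gene = c("GENE_A", "ENSG00000199674", "GENE_B",
           "ENSG00000207604", "GENE_C", "ENSG00000221312"),
  hsap = c(0, 0, 0, 0, 0, 0),
  mmul = c(NA, 2, NA, NA, NA, 1),
  mmus = c(NA, 2, NA, NA, NA, 2),
  rnor = c(NA, 2, NA, 1, NA, 3),
  cfam = c(NA, 2, NA, 2, NA, 2)
)

# Keep rows that have no NA in any column
all_complete <- final[complete.cases(final), ]

# Keep rows that have no NA in columns 5 and 6 (rnor, cfam) only
partial_complete <- final[complete.cases(final[, 5:6]), ]

print(all_complete)
print(partial_complete)
```

Selecting the columns by name, e.g. final[, c("rnor", "cfam")], does the same thing and is less fragile if the column order ever changes.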
Your solution can't work. If you insist on using is.na, then you have to do something like:

    > final[rowSums(is.na(final[ , 5:6])) == 0, ]
                 gene hsap mmul mmus rnor cfam
    2 ENSG00000199674    0    2    2    2    2
    4 ENSG00000207604    0   NA   NA    1    2
    6 ENSG00000221312    0    1    2    3    2

but using complete.cases is quite a lot clearer, and faster.
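To check that the two approaches really pick the same rows, you can compare the logical vectors directly. The sketch below uses a small made-up data frame df (not the data from the question). The last statement is an assumption about what "all NAs" in the title means: it adapts the rowSums form to drop only the rows where every checked column is NA, which complete.cases does not do on its own.

```r
# Hypothetical toy data, not the data from the question
df <- data.frame(
  id = c("a", "b", "c", "d"),
  x  = c(1, NA, NA, 4),
  y  = c(1, 2, NA, NA)
)
cols <- c("x", "y")

# Rows with zero NAs in the checked columns, computed both ways
keep_isna <- rowSums(is.na(df[, cols])) == 0
keep_cc   <- complete.cases(df[, cols])

# rowSums() carries the row names along, so strip them before comparing
same <- identical(unname(keep_isna), keep_cc)
print(same)

# Drop only the rows where *all* of the checked columns are NA
not_all_na <- df[rowSums(is.na(df[, cols])) < length(cols), ]
print(not_all_na)
```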
Try na.omit(your.data.frame). As for the second question, try posting it as another question (for clarity).
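A small illustration of na.omit, using a made-up your.data.frame (the column names and values here are placeholders, not the data from the question). It drops every row containing at least one NA and records which rows were removed in the "na.action" attribute of the result.

```r
# Hypothetical toy data standing in for your.data.frame
your.data.frame <- data.frame(
  id = c("a", "b", "c"),
  x  = c(1, NA, 3),
  y  = c(1, 2, NA)
)

# Remove every row that has an NA in any column
cleaned <- na.omit(your.data.frame)
print(cleaned)

# Positions of the rows that were dropped
dropped <- attr(cleaned, "na.action")
print(as.integer(dropped))
```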
tidyr has a new function drop_na:

    library(tidyr)
    df %>% drop_na()
    #              gene hsap mmul mmus rnor cfam
    # 2 ENSG00000199674    0    2    2    2    2
    # 6 ENSG00000221312    0    1    2    3    2
    df %>% drop_na(rnor, cfam)
    #              gene hsap mmul mmus rnor cfam
    # 2 ENSG00000199674    0    2    2    2    2
    # 4 ENSG00000207604    0   NA   NA    1    2
    # 6 ENSG00000221312    0    1    2    3    2