Replace <NA> in a factor column


1) addNA If fac is a factor, addNA(fac) is the same factor but with NA added as a level. See ?addNA.

To force the NA level to be 88:

facna <- addNA(fac)
levels(facna) <- c(levels(fac), 88)

giving:

> facna
 [1] 1  2  3  3  4  88 2  4  88 3
Levels: 1 2 3 4 88
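To see what each step of (1) changes, here is a minimal sketch. It assumes the same values as the input fac from the note at the end of this answer, but builds them with factor() rather than structure() so the block stands alone.

```r
# Same values as the answer's input, built with factor() for readability
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))

levels(fac)                  # NA is a missing value here, not a level

facna <- addNA(fac)
levels(facna)                # NA has been appended as the last level

# Relabel the NA level; c() coerces 88 to the string "88"
levels(facna) <- c(levels(fac), 88)
levels(facna)
any(is.na(facna))            # no missing values remain
```

Note that fac itself is left untouched; only the copy facna gains the new level.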

1a) This can be written in a single line as follows:

`levels<-`(addNA(fac), c(levels(fac), 88))

2) factor It can also be done in one line using the various arguments of factor like this:

factor(fac, levels = levels(addNA(fac)), labels = c(levels(fac), 88), exclude = NULL)

2a) or equivalently:

factor(fac, levels = c(levels(fac), NA), labels = c(levels(fac), 88), exclude = NULL)

3) ifelse Another approach is:

factor(ifelse(is.na(fac), 88, paste(fac)), levels = c(levels(fac), 88))
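A quick way to convince yourself that the base R variants (1), (1a), (2), (2a) and (3) agree is to run them side by side. This is only a sketch: the names new_levels, results and all_same are made up for the illustration, and fac is rebuilt with factor() so the block runs on its own.

```r
fac <- factor(c(1, 2, 3, 3, 4, NA, 2, 4, NA, 3))
new_levels <- c(levels(fac), 88)   # "1" "2" "3" "4" "88"

results <- list(
  v1  = { facna <- addNA(fac); levels(facna) <- new_levels; facna },
  v1a = `levels<-`(addNA(fac), new_levels),
  v2  = factor(fac, levels = levels(addNA(fac)), labels = new_levels, exclude = NULL),
  v2a = factor(fac, levels = c(levels(fac), NA), labels = new_levels, exclude = NULL),
  v3  = factor(ifelse(is.na(fac), 88, paste(fac)), levels = new_levels)
)

# Compare every variant with the first one: same values and same levels
all_same <- all(vapply(results, function(r) {
  identical(as.character(r), as.character(results$v1)) &&
    identical(levels(r), levels(results$v1))
}, logical(1)))
all_same
```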

4) forcats The forcats package has a function for this. (Since forcats 1.0.0, fct_explicit_na() is deprecated in favor of fct_na_value_to_level().)

library(forcats)
fct_explicit_na(fac, "88")
## [1] 1  2  3  3  4  88 2  4  88 3 
## Levels: 1 2 3 4 88

Note: We used the following for the input fac:

fac <- structure(c(1L, 2L, 3L, 3L, 4L, NA, 2L, 4L, NA, 3L), .Label = c("1", "2", "3", "4"), class = "factor")

Update: Have improved (1) and added (1a). Later added (4).


Another way to do it is:

# check levels
levels(df$a)
# [1] "3"  "4"  "7"  "9"  "10"

# add a new factor level, i.e. 88 in our example
df$a <- factor(df$a, levels = c(levels(df$a), 88))

# convert all NAs to 88
df$a[is.na(df$a)] <- 88

# check levels again
levels(df$a)
# [1] "3"  "4"  "7"  "9"  "10" "88"
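The order of the two steps matters: the level has to exist before the value is assigned. The sketch below illustrates this with a hypothetical data frame (the question's actual df is not shown here, so the values are assumed to match the levels listed above).

```r
# Hypothetical data; the levels come out as "3" "4" "7" "9" "10"
df <- data.frame(a = factor(c(3, 4, NA, 7, 9, NA, 10)))

# Assigning a value that is not a level yet just yields NA again
# (R warns "invalid factor level, NA generated"; suppressed here)
bad <- df$a
suppressWarnings(bad[is.na(bad)] <- 88)
sum(is.na(bad))              # the two NAs are still there

# Adding the level first makes the assignment valid
df$a <- factor(df$a, levels = c(levels(df$a), 88))
df$a[is.na(df$a)] <- 88
levels(df$a)
sum(is.na(df$a))             # no NAs left
```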


The basic concept of a factor variable is that it can only take specific values, i.e., the levels. A value not in the levels is invalid.

You have two possibilities:

If you have a variable that follows this concept, make sure to define all levels when you create it, even those without corresponding values.

Or make the variable a character variable and work with that.
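Both possibilities can be sketched in a few lines. The vector x below is hypothetical and stands in for the raw values; 88 is the replacement value from the question.

```r
x <- c(3, 4, NA, 7, 9, NA, 10)   # hypothetical raw values

# Possibility 1: declare every level up front, including 88,
# even though no observation has that value yet
f <- factor(x, levels = c(3, 4, 7, 9, 10, 88))
f[is.na(f)] <- "88"              # valid, because "88" is already a level
table(f)

# Possibility 2: use a character vector, which accepts any value
ch <- as.character(x)
ch[is.na(ch)] <- "88"
ch
```

The first form keeps the guarantee that only known values can occur; the second trades that guarantee for flexibility.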

PS: Often these problems result from data import. For instance, what you show there looks like it should be a numeric variable and not a factor variable.
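If the column really should be numeric, one common approach is to convert it through character first and then replace the missing values. The factor f below is a made-up example of such an imported column.

```r
# Hypothetical factor that came out of an import but really holds numbers
f <- factor(c("3", "4", NA, "7", "10"))

as.numeric(f)                         # caution: these are level codes, not the values

num <- as.numeric(as.character(f))    # recovers the underlying numbers
num[is.na(num)] <- 88
num
```

With a numeric vector there are no levels to maintain, so the replacement needs no extra step.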