How to replace NA values in a table for selected columns How to replace NA values in a table for selected columns r r

How to replace NA values in a table for selected columns


You can do:

x[, 1:2][is.na(x[, 1:2])] <- 0

or better (IMHO), use the variable names:

x[c("a", "b")][is.na(x[c("a", "b")])] <- 0

In both cases, 1:2 or c("a", "b") can be replaced by a pre-defined vector.


Edit 2020-06-15

Since data.table 1.12.4 (Oct 2019), data.table gains two functions to facilitate this: nafill and setnafill.

nafill operates on columns:

cols = c('a', 'b')y[ , (cols) := lapply(.SD, nafill, fill=0), .SDcols = cols]

setnafill operates on tables (the replacements happen by-reference/in-place)

setnafill(y, cols=cols, fill=0)# print y to show the effecty[]

This will also be more efficient than the other options; see ?nafill for more, the last-observation-carried-forward (LOCF) and next-observation-carried-backward (NOCB) versions of NA imputation for time series.


This will work for your data.table version:

for (col in c("a", "b")) y[is.na(get(col)), (col) := 0]

Alternatively, as David Arenburg points out below, you can use set (side benefit - you can use it either on data.frame or data.table):

for (col in 1:2) set(x, which(is.na(x[[col]])), col, 0)


Building on @Robert McDonald's tidyr::replace_na() answer, here are some dplyr options for controlling which columns the NAs are replaced:

library(tidyverse)# by column type:x %>%  mutate_if(is.numeric, ~replace_na(., 0))# select columns defined in vars(col1, col2, ...):x %>%  mutate_at(vars(a, b, c), ~replace_na(., 0))# all columns:x %>%  mutate_all(~replace_na(., 0))