Merge unequal dataframes and replace missing rows with 0 Merge unequal dataframes and replace missing rows with 0 r r

Merge unequal dataframes and replace missing rows with 0


Take a look at the help page for merge. The all parameter lets you specify different types of merges. Here we want to set all = TRUE. This will make merge return NA for the values that don't match, which we can update to 0 with is.na():

zz <- merge(df1, df2, all = TRUE)zz[is.na(zz)] <- 0> zz  x y1 a 02 b 13 c 04 d 05 e 0

Updated many years later to address follow up question

You need to identify the variable names in the second data table that you aren't merging on - I use setdiff() for this. Check out the following:

df1 = data.frame(x=c('a', 'b', 'c', 'd', 'e', NA))df2 = data.frame(x=c('a', 'b', 'c'),y1 = c(0,1,0), y2 = c(0,1,0))#merge as beforedf3 <- merge(df1, df2, all = TRUE)#columns in df2 not in df1unique_df2_names <- setdiff(names(df2), names(df1))df3[unique_df2_names][is.na(df3[, unique_df2_names])] <- 0 

Created on 2019-01-03 by the reprex package (v0.2.1)


Or, as an alternative to @Chase's code, being a recent plyr fan with a background in databases:

require(plyr)zz<-join(df1, df2, type="left")zz[is.na(zz)] <- 0


Another alternative with data.table.

EXAMPLE DATA

dt1 <- data.table(df1)dt2 <- data.table(df2)setkey(dt1,x)setkey(dt2,x)

CODE

dt2[dt1,list(y=ifelse(is.na(y),0,y))]