
R - Create a new variable where each observation depends on another table and other variables in the data frame


Sure, it can be done in data.table:

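    library(data.table)
    setDT(df)
    setDT(Inc)  # data.table's melt() needs a data.table, not a plain data.frame

    # Reshape Inc to long form, then join it to df and assign the matched value
    df[
      melt(Inc, id.vars="ZIP2", variable.name="eth", value.name="Inc"),
      Inc := i.Inc,
      on=c(ZIP1 = "ZIP2","eth")
    ]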

The syntax for this "merge-assign" operation is X[i, Xcol := expression, on=merge_cols].

You can run the i = melt(Inc, id.vars="ZIP2", variable.name="eth", value.name="Inc") part on its own to see how it works: it reshapes Inc to long form, with one row per ZIP2/eth pair. Inside the merge, columns from i can be referred to with an i. prefix (here, i.Inc).
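Here is a minimal runnable sketch of the merge-assign above. The question's data is not shown on this page, so the sample df and Inc below are assumptions: the filled cells are the ones that appear in the outputs further down, and every cell those outputs do not reveal is set to NA. The intermediate name long is only for illustration.

```r
library(data.table)

# Hypothetical sample data (assumed; unknown cells are NA)
df  <- data.table(eth = c("A", "B", "B", "A", "C"),
                  ZIP1 = c(1, 1, 2, 3, 5))
Inc <- data.table(ZIP2 = c(1, 2, 3, 5),
                  A = c(56, NA, 43, NA),
                  B = c(49, 10, NA, NA),
                  C = c(NA, NA, NA, 17))

# Step 1: reshape Inc to long form, one row per (ZIP2, eth) pair.
# variable.factor = FALSE keeps eth as character, like df$eth.
long <- melt(Inc, id.vars = "ZIP2", variable.name = "eth",
             value.name = "Inc", variable.factor = FALSE)

# Step 2: join long to df and assign by reference.
# i.Inc is the Inc column of the table passed as i (here, long).
df[long, Inc := i.Inc, on = c(ZIP1 = "ZIP2", eth = "eth")]

# df$Inc is now 56 49 10 43 17, and the row order of df is unchanged
```

The on argument is written with both names spelled out here; it maps ZIP1 in df to ZIP2 in long and matches eth to eth.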


Alternately...

    setDT(df)
    setDT(Inc)
    df[, Inc := Inc[.(ZIP1), eth, on="ZIP2", with=FALSE], by=eth]

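This is built on a similar idea: for each eth group, the inner join looks up the rows of Inc whose ZIP2 matches ZIP1 and, because of with=FALSE, takes the column named by that group's eth value. The package vignettes are a good place to start for this sort of syntax.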
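If the grouped call is hard to read, the same per-group lookup can be spelled out in base R. This is only a sketch of the idea, not what data.table does internally, and the sample data is assumed (cells not revealed by the outputs on this page are NA).

```r
# Hypothetical sample data (assumed; unknown cells are NA)
df  <- data.frame(eth = c("A", "B", "B", "A", "C"),
                  ZIP1 = c(1, 1, 2, 3, 5),
                  stringsAsFactors = FALSE)
Inc <- data.frame(ZIP2 = c(1, 2, 3, 5),
                  A = c(56, NA, 43, NA),
                  B = c(49, 10, NA, NA),
                  C = c(NA, NA, NA, 17))

df$Inc <- NA_real_
for (g in unique(df$eth)) {
  rows <- df$eth == g                          # rows of df in this group
  zip_pos <- match(df$ZIP1[rows], Inc$ZIP2)    # matching rows of Inc
  df$Inc[rows] <- Inc[[g]][zip_pos]            # column of Inc named by g
}

# df$Inc is now 56 49 10 43 17
```

One loop iteration corresponds to one by=eth group: the group value picks the column, and ZIP1 picks the rows.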


We can use row/column indexing

    df$Inc <- Inc[cbind(match(df$ZIP1, Inc$ZIP2), match(df$eth, colnames(Inc)))]
    df
    #  eth ZIP1 Inc
    #1   A    1  56
    #2   B    1  49
    #3   B    2  10
    #4   A    3  43
    #5   C    5  17
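To see why this works, the index can be built step by step. The sketch below uses assumed sample data (cells not revealed by the output above are NA), and the names row_idx, col_idx and idx are only for illustration. It also converts Inc to a matrix explicitly so the indexing rule is easy to see.

```r
# Hypothetical sample data (assumed; unknown cells are NA)
df  <- data.frame(eth = c("A", "B", "B", "A", "C"),
                  ZIP1 = c(1, 1, 2, 3, 5),
                  stringsAsFactors = FALSE)
Inc <- data.frame(ZIP2 = c(1, 2, 3, 5),
                  A = c(56, NA, 43, NA),
                  B = c(49, 10, NA, NA),
                  C = c(NA, NA, NA, 17))

row_idx <- match(df$ZIP1, Inc$ZIP2)       # which row of Inc: 1 1 2 3 4
col_idx <- match(df$eth, colnames(Inc))   # which column of Inc: 2 3 3 2 4

# A two-column matrix used as an index picks one cell per row of the
# matrix: (row_idx[k], col_idx[k]) for k = 1, 2, ...
idx <- cbind(row_idx, col_idx)
df$Inc <- as.matrix(Inc)[idx]

# df$Inc is now 56 49 10 43 17
```

If a ZIP1 or eth value has no match, match() returns NA for that position.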


What about this?

    library(reshape2)
    merge(df, melt(Inc, id="ZIP2"), by.x = c("ZIP1", "eth"), by.y = c("ZIP2", "variable"))
    #  ZIP1 eth value
    #1    1   A    56
    #2    1   B    49
    #3    2   B    10
    #4    3   A    43
    #5    5   C    17
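For anyone without reshape2 installed, the long table can also be built by hand in base R before merging. This is a sketch with assumed sample data (cells not revealed by the output above are NA); the names long and res are only for illustration.

```r
# Hypothetical sample data (assumed; unknown cells are NA)
df  <- data.frame(eth = c("A", "B", "B", "A", "C"),
                  ZIP1 = c(1, 1, 2, 3, 5),
                  stringsAsFactors = FALSE)
Inc <- data.frame(ZIP2 = c(1, 2, 3, 5),
                  A = c(56, NA, 43, NA),
                  B = c(49, 10, NA, NA),
                  C = c(NA, NA, NA, 17))

# Long form by hand: one row per (ZIP2, eth) pair
long <- data.frame(ZIP2  = rep(Inc$ZIP2, times = 3),
                   eth   = rep(c("A", "B", "C"), each = nrow(Inc)),
                   value = c(Inc$A, Inc$B, Inc$C),
                   stringsAsFactors = FALSE)

# ZIP1 in df is matched to ZIP2 in long, and eth to eth
res <- merge(df, long, by.x = c("ZIP1", "eth"), by.y = c("ZIP2", "eth"))

# res$value is 56 49 10 43 17
```

Note that merge() returns a new data frame rather than modifying df, and by default it drops rows of df that have no match; pass all.x = TRUE to keep them.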