R - Create a new variable where each observation depends on another table and other variables in the data frame
Sure, it can be done in data.table:
    library(data.table)
    setDT(df)
    df[ melt(Inc, id.var="ZIP2", variable.name="eth", value.name="Inc"), Inc := i.Inc, on=c(ZIP1 = "ZIP2","eth") ]
The syntax for this "merge-assign" operation is X[i, Xcol := expression, on=merge_cols]. You can run the i = melt(Inc, id.var="ZIP2", variable.name="eth", value.name="Inc") part on its own to see how it works. Inside the merge, columns from i can be referred to with i.* prefixes.
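To make the two steps concrete, here is a minimal sketch on toy data. The contents of df and Inc below are made up for illustration, since the question's data is not reproduced here; only the column names ZIP1, ZIP2 and eth come from the code above. I also pass variable.factor = FALSE so that the melted eth column is character, like the one in df; that is my choice for the sketch, not part of the original call.

```r
library(data.table)

# Hypothetical lookup table: one income column per eth code, keyed by ZIP2
Inc <- data.table(ZIP2 = 1:3,
                  A = c(56, 21, 43),
                  B = c(49, 10, 35))

# Hypothetical observations to be filled in
df <- data.table(eth = c("A", "B", "B", "A"),
                 ZIP1 = c(1L, 1L, 2L, 3L))

# Step 1: reshape Inc to long format, one row per (ZIP2, eth) pair
i <- melt(Inc, id.vars = "ZIP2", variable.name = "eth",
          value.name = "Inc", variable.factor = FALSE)

# Step 2: update join. For each row of i that matches a row of df on
# (ZIP1 == ZIP2, eth == eth), copy i's Inc value into a new df column.
df[i, Inc := i.Inc, on = c(ZIP1 = "ZIP2", "eth")]

print(df)
```

Rows of i that match nothing in df are simply ignored, and df is modified by reference, so there is no need to assign the result back.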
Alternately...
    setDT(df)
    setDT(Inc)
    df[, Inc := Inc[.(ZIP1), eth, on="ZIP2", with=FALSE], by=eth]
This is built on a similar idea. The package vignettes are a good place to start for this sort of syntax.
We can use row/column indexing:

    df$Inc <- Inc[cbind(match(df$ZIP1, Inc$ZIP2), match(df$eth, colnames(Inc)))]
    df
    #  eth ZIP1 Inc
    #1   A    1  56
    #2   B    1  49
    #3   B    2  10
    #4   A    3  43
    #5   C    5  17
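In case the one-liner is hard to read, here is the same idea split into steps. This is only a sketch: the data are hypothetical, and the names rows, cols and idx are introduced purely for illustration.

```r
# Hypothetical toy data; the values are made up for illustration
Inc <- data.frame(ZIP2 = 1:3, A = c(56, 21, 43), B = c(49, 10, 35))
df <- data.frame(eth = c("A", "B", "B", "A"), ZIP1 = c(1, 1, 2, 3))

rows <- match(df$ZIP1, Inc$ZIP2)      # row position of each ZIP code in Inc
cols <- match(df$eth, colnames(Inc))  # column position of each eth code in Inc

# A two-column matrix index selects one cell per row: (rows[k], cols[k])
idx <- cbind(rows, cols)
df$Inc <- Inc[idx]

print(df)
```

One caveat: indexing a data frame with a matrix goes through as.matrix(), so if Inc also holds a character column, the values extracted this way come back as character and need converting.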
What about this?
    library(reshape2)
    merge(df, melt(Inc, id="ZIP2"), by.x = c("ZIP1", "eth"), by.y = c("ZIP2", "variable"))
      ZIP1 eth value
    1    1   A    56
    2    1   B    49
    3    2   B    10
    4    3   A    43
    5    5   C    17