Convert columns of arbitrary class to the class of matching columns in another data.table


Not very elegant but you may 'build' the as.* call like this:

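for (x in colnames(A)) {
  # Use [[x]] to extract the column: A[, x] does not return column x on a data.table.
  # class(...)[1L] guards against columns with more than one class.
  set(A, j = x, value = eval(call(paste0("as.", class(B[[x]])[1L]), A[[x]])))
}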
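The same idea can be sketched without `eval(call(...))` by looking the converter up with `match.fun()`. This is only an illustration: `coerce_like` is a hypothetical helper name, the small `A` and `B` tables are made-up stand-ins for the real data, and it assumes an `as.<class>` function exists for every target class.

```r
library(data.table)

# Hypothetical helper (not from the original answers): coerce `x` to the
# first class of `template` by looking up the matching as.<class> function.
coerce_like <- function(x, template) {
  target <- class(template)[1L]
  converter <- paste0("as.", target)
  if (!exists(converter, mode = "function")) {
    stop("no converter named ", converter)
  }
  match.fun(converter)(x)
}

# Made-up stand-ins for the real tables
A <- data.table(year = c(1L, 2L, 3L), stratum = c(1L, 2L, 3L))
B <- data.table(year = c(1, 2, 3), stratum = c(1L, 2L, 3L), yr = c(1, 2, 3))

# Only touch columns present in both tables; set() modifies A by reference
for (col in intersect(names(A), names(B))) {
  set(A, j = col, value = coerce_like(A[[col]], B[[col]]))
}

sapply(A, class)
#      year   stratum
# "numeric" "integer"
```

Some classes will probably need special handling. For example, `as.Date()` on numeric input may require an `origin` argument depending on the R version.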


This is one very crude way to ensure common classes:

library(magrittr)
cols = intersect(names(A), names(B))
r    = rbindlist(list(A = A, B = B[, ..cols]), idcol = TRUE)
r[, (cols) := lapply(.SD, . %>% as.character %>% type.convert), .SDcols=cols]
B[, (cols) := r[.id=="B", ..cols]]
A[, (cols) := r[.id=="A", ..cols]]

sapply(A, class); sapply(B, class)
#      year   stratum 
# "integer" "integer" 
#      year   stratum        yr 
# "integer" "integer" "numeric" 

I don't like this solution:

  • I routinely use all-digit character codes for IDs (like "00001", "02995"), and this would coerce them to actual integers, dropping the leading zeros, which is bad.
  • Who knows what this will do to fancy classes like Date or factor? This won't matter so much if you do this col-classes normalization as soon as you read data in, I suppose.
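The first objection is easy to reproduce in base R. The sketch below also shows one possible guard, a hypothetical `safe_convert` helper (not part of the original answer) that keeps the character version whenever the round trip through `type.convert()` would change the text.

```r
# ID codes stored as character, with meaningful leading zeros
ids <- c("00001", "02995")

# type.convert() turns them into integers, so the zeros are lost
type.convert(ids, as.is = TRUE)
# [1]    1 2995

# Hypothetical guard: convert only when converting back to character
# reproduces the original strings exactly.
safe_convert <- function(x) {
  chr <- as.character(x)
  out <- type.convert(chr, as.is = TRUE)
  if (identical(as.character(out), chr)) out else chr
}

safe_convert(ids)        # stays character: "00001" "02995"
safe_convert(c(1, 2))    # becomes integer: 1 2
```

This guard is deliberately conservative: anything that does not survive the round trip, including dates and factors, comes back as plain character.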

Data:

# slightly tweaked from OP
A <- setDT(structure(list(
  year = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
           2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
           3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L),
  stratum = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
              1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
              1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L)),
  .Names = c("year", "stratum"), row.names = c(NA, -45L), class = c("data.frame")))

B <- setDT(structure(list(
  year = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
           2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
           3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3),
  stratum = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
              1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L,
              1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L),
  yr = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
         2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
         3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)),
  .Names = c("year", "stratum", "yr"), row.names = c(NA, -45L), class = c("data.frame")))

Comment. If you have something against magrittr, use function(x) type.convert(as.character(x)) in place of the . %>% bit.
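For reference, a magrittr-free version of the same approach might look like the sketch below. The tiny `A` and `B` tables and the `to_common` name are placeholders chosen for illustration, and `as.is = TRUE` is passed explicitly because newer R versions ask for it.

```r
library(data.table)

# Made-up stand-ins for the real tables
A <- data.table(year = c(1L, 2L), stratum = c(1L, 2L))
B <- data.table(year = c(1, 2), stratum = c(1L, 2L), yr = c(1, 2))

# Plain function in place of the `. %>% as.character %>% type.convert` pipeline
to_common <- function(x) type.convert(as.character(x), as.is = TRUE)

cols <- intersect(names(A), names(B))

# Stack the shared columns, convert them once, then write each part back
r <- rbindlist(list(A = A, B = B[, ..cols]), idcol = TRUE)
r[, (cols) := lapply(.SD, to_common), .SDcols = cols]
A[, (cols) := r[.id == "A", ..cols]]
B[, (cols) := r[.id == "B", ..cols]]

sapply(A, class)
sapply(B, class)
```

Converting after stacking means both tables see the same data when `type.convert()` picks a type, which is why the shared columns end up with matching classes.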


Based on the discussion in this question, and comments in this answer, I'm thinking I may have had it right, and just landed on an odd exception.

Note that the class doesn't change, but the technicality is that it doesn't matter (for my particular use-case that prompted the question). Below I show my "failed approach", but by following through to the merge, and the classes of the columns in the merged data.table, we can see why the approach works: integers will just get promoted.

s2c <- function (x, type = "list") {
    as.call(lapply(c(type, x), as.symbol))
}

# In this case, I can assume all columns of A can be found in B
# I am also able to assume that the desired conversion is possible
B.class <- sapply(B[,eval(s2c(names(A)))], class)
for(col in names(A)){
    set(A, j=col, value=as(A[[col]], B.class[col]))
}

# Below here is new from what I tried in question
AB <- data.table:::merge.data.table(A, B, all=T, by=c("stratum","year"))
sapply(AB, class)
#   stratum      year        bt        yr 
# "integer" "numeric" "numeric" "numeric" 

Although the problem in the question isn't solved by this answer, I figured I'd post to point out that the failure to convert "integer" to "numeric" might not be a problem in many situations, so this is a straightforward, albeit circumstantial, solution.
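To check the behaviour on a given installation, a small base-R sketch along these lines may help. Whether `as(x, "numeric")` leaves an integer vector as integer appears to depend on the R version, so no output is shown for that call; `as.numeric()` is the explicit alternative when a double is really needed.

```r
library(methods)  # provides as(); normally attached already

x <- c(1L, 2L, 3L)

# Result type may differ between R versions, so inspect it rather than assume
typeof(as(x, "numeric"))

# as.numeric() always returns a double vector
y <- as.numeric(x)
typeof(y)
# [1] "double"

# Integers and doubles holding the same values compare equal, and combining
# them promotes the integers to double
all(x == y)
# [1] TRUE
typeof(c(x, y))
# [1] "double"
```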