
How to create mean and s.d. columns in data.table


.SD is itself a data.table, so when you call mean(.SD) you are attempting to take the mean of an entire data.table.

The function mean() does not know what to do with a data.table, so it issues a warning and returns NA.

Have a look:

    ## the .SD in your question is the same as test[, c('A','B','C','D')]
    ## try taking its mean
    mean(test[, c('A','B','C','D')])
    # Warning in mean.default(test[, c("A", "B", "C", "D")]) :
    #   argument is not numeric or logical: returning NA
    # [1] NA

Try this instead: use lapply(.SD, mean) for column-wise means, or apply(.SD, 1, mean) for row-wise means.
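As a minimal sketch of both forms: the `test` table below is an assumption, reconstructed from the output printed in this answer, and the variable names `cols`, `col_means`, `row_means` and `row_sds` are only illustrative.

```r
library(data.table)

# Assumed reconstruction of `test`, based on the output printed in this answer
test <- data.table(
  id = 1:5,
  A  = seq(2, 9, length.out = 5),
  B  = seq(3, 9, length.out = 5),
  C  = seq(4, 9, length.out = 5),
  D  = seq(5, 9, length.out = 5)
)
cols <- c("A", "B", "C", "D")  # illustrative name for the columns of interest

# Column-wise: one mean per column, returned as a one-row data.table
col_means <- test[, lapply(.SD, mean), .SDcols = cols]

# Row-wise: one value per row, computed across the selected columns
row_means <- test[, apply(.SD, 1, mean), .SDcols = cols]
row_sds   <- test[, apply(.SD, 1, sd),   .SDcols = cols]
```

Neither call needs by=id, because apply(.SD, 1, ...) already walks the rows one at a time. To store the results as columns, assign them with `:=`, for example test[, mean_test := apply(.SD, 1, mean), .SDcols = cols].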


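You can get the row means with rowMeans(.SD) instead, and thus avoid using apply (similar to the linked question). Note that sd(.SD) works here only because each id has exactly one row, so .SD can be coerced to a plain numeric vector: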

    test[, `:=`(mean_test = rowMeans(.SD),
                sd_test   = sd(.SD)),
         by = id, .SDcols = c('A','B','C','D')]
    test
    #    id    A   B    C D mean_test   sd_test
    # 1:  1 2.00 3.0 4.00 5     3.500 1.2909944
    # 2:  2 3.75 4.5 5.25 6     4.875 0.9682458
    # 3:  3 5.50 6.0 6.50 7     6.250 0.6454972
    # 4:  4 7.25 7.5 7.75 8     7.625 0.3227486
    # 5:  5 9.00 9.0 9.00 9     9.000 0.0000000


As a fun fact, you can also combine the columns into a single vector inside mean() and sd():

    test[, `:=`(mean = mean(c(A,B,C,D)),
                sd   = sd(c(A,B,C,D))),
         by = id]
    test
    #    id    A   B    C D   mean        sd
    # 1:  1 2.00 3.0 4.00 5  3.500 1.2909944
    # 2:  2 3.75 4.5 5.25 6  4.875 0.9682458
    # 3:  3 5.50 6.0 6.50 7  6.250 0.6454972
    # 4:  4 7.25 7.5 7.75 8  7.625 0.3227486
    # 5:  5 9.00 9.0 9.00 9  9.000 0.0000000

And you can also use quote() and eval():

    cols <- quote(c(A,B,C,D))
    test[, `:=`(mean = mean(eval(cols)),
                sd   = sd(eval(cols))),
         by = id]
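If the column names are held as a character vector rather than a quoted expression, one possible variant is to flatten .SD with unlist(). This is only a sketch: the `test` table is an assumed reconstruction from the output printed above, and `cols_chr` is a hypothetical name.

```r
library(data.table)

# Assumed reconstruction of `test`, based on the output printed in this answer
test <- data.table(
  id = 1:5,
  A  = seq(2, 9, length.out = 5),
  B  = seq(3, 9, length.out = 5),
  C  = seq(4, 9, length.out = 5),
  D  = seq(5, 9, length.out = 5)
)
cols_chr <- c("A", "B", "C", "D")  # hypothetical name: columns given as strings

test[, c("mean", "sd") := {
  v <- unlist(.SD)       # flatten this group's selected columns into one numeric vector
  list(mean(v), sd(v))   # one mean and one s.d. per id group
}, by = id, .SDcols = cols_chr]
```

Because unlist(.SD) pools every value in the group, an id with several rows would get a single mean and s.d. computed over all of its rows and columns together, which may or may not be what you want.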