Understanding `scale` in R Understanding `scale` in R r r

Understanding `scale` in R


log simply takes the logarithm (base e, by default) of each element of the vector.
scale, with default settings, will calculate the mean and standard deviation of the entire vector, then "scale" each element by those values by subtracting the mean and dividing by the sd. (If you use scale(x, scale=FALSE), it will only subtract the mean but not divide by the std deviation.)

Note that this will give you the same values

   set.seed(1)   x <- runif(7)   # Manually scaling   (x - mean(x)) / sd(x)   scale(x)


It provides nothing else but a standardization of the data. The values it creates are known under several different names, one of them being z-scores ("Z" because the normal distribution is also known as the "Z distribution").

More can be found here:

http://en.wikipedia.org/wiki/Standard_score


This is a late addition but I was looking for information on the scale function myself and though it might help somebody else as well.

To modify the response from Ricardo Saporta a little bit.
Scaling is not done using standard deviation, at least not in version 3.6.1 of R, I base this on "Becker, R. (2018). The new S language. CRC Press." and my own experimentation.

X.man.scaled <- X/sqrt(sum(X^2)/(length(X)-1))X.aut.scaled <- scale(X, center = F)

The result of these rows are exactly the same, I show it without centering because of simplicity.

I would respond in a comment but did not have enough reputation.