How to use the 'sweep' function
sweep() is typically used when you operate on a matrix by row or by column, and the other input of the operation is a different value for each row or column. Whether you operate by row or by column is defined by MARGIN, as for apply(). The values used for what I called "the other input" are defined by STATS. So, for each row (or column), you take a value from STATS and use it in the operation defined by FUN.
For instance, if you want to add 1 to the 1st row, 2 to the 2nd, etc. of the matrix you defined, you will do:
sweep(M, 1, 1:4, "+")
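To make that call concrete, here is a small runnable sketch. The 4 x 3 matrix of zeros is my own stand-in for M (the matrix from the question is not reproduced here), chosen so that the swept values are easy to read off.

```r
# Hypothetical stand-in for M: a 4 x 3 matrix of zeros
M <- matrix(0, nrow = 4, ncol = 3)

# MARGIN = 1 means "by row": STATS[i] is combined with every element of row i
res <- sweep(M, 1, 1:4, "+")
print(res)
```

Each row i of the result should contain the value i in every column, since i was added to a row of zeros.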
Frankly, I did not understand the definition in the R documentation either; I just learned by looking up examples.
sweep() can be great for systematically manipulating a large matrix either column by column, or row by row, as shown below:
> print(size)
     Weight Waist Height
[1,]    130    26    140
[2,]    110    24    155
[3,]    118    25    142
[4,]    112    25    175
[5,]    128    26    170
> sweep(size, 2, c(10, 20, 30), "+")
     Weight Waist Height
[1,]    140    46    170
[2,]    120    44    185
[3,]    128    45    172
[4,]    122    45    205
[5,]    138    46    200
Granted, this example is simple, but by changing the STATS and FUN arguments, other manipulations are possible.
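As one illustration of swapping STATS and FUN, the sketch below divides each column of the same size matrix by its own maximum. The names col_max and scaled are mine, and scaling by the column maximum is just an example I picked, not something the original post does.

```r
# Rebuild the size matrix from the printed output above
size <- matrix(c(130, 110, 118, 112, 128,
                 26, 24, 25, 25, 26,
                 140, 155, 142, 175, 170),
               ncol = 3,
               dimnames = list(NULL, c("Weight", "Waist", "Height")))

# STATS: one value per column (here, the column maximum)
col_max <- apply(size, 2, max)

# FUN = "/" divides every element of column j by col_max[j]
scaled <- sweep(size, 2, col_max, "/")
print(scaled)
```

After this, the largest entry in each column is 1 and every other entry is a fraction of that column's maximum.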
This question is a bit old, but since I've recently faced this problem: a typical use of sweep can be found in the source code for the stats function cov.wt, used for computing weighted covariance matrices. I'm looking at the code in R 3.0.1. Here sweep is used to subtract out column means before computing the covariance. On line 19 of the code the centering vector is derived:
center <- if (center) colSums(wt * x) else 0
and on line 54 it is swept out of the matrix:
x <- sqrt(wt) * sweep(x, 2, center, check.margin = FALSE)
The author of the code is using the default value FUN = "-", which confused me for a while.
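A quick way to convince yourself that the default is subtraction is to compare a call without FUN against one with FUN = "-" spelled out. This is only a sketch: the small matrix x and the equal weights wt are made up for illustration, and it mimics just the centering step, not the rest of cov.wt.

```r
# Made-up data: 3 observations, 2 variables
x <- matrix(c(1, 2, 3, 4, 5, 6), ncol = 2)

# Equal weights that sum to 1, so colSums(wt * x) is the column mean
wt <- rep(1 / nrow(x), nrow(x))
center <- colSums(wt * x)

# No FUN given: sweep() falls back to its default, "-"
centered_default <- sweep(x, 2, center, check.margin = FALSE)

# Same call with the subtraction made explicit
centered_explicit <- sweep(x, 2, center, FUN = "-", check.margin = FALSE)

print(identical(centered_default, centered_explicit))
print(colMeans(centered_default))
```

The two results should be identical, and the column means of the centered matrix should be zero up to floating-point error.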