Vectorized IF statement in R?
> x <- seq(0.1,10,0.1)
> x
 [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1.1 1.2 1.3 1.4 1.5
[16] 1.6 1.7 1.8 1.9 2.0 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 3.0
[31] 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8 3.9 4.0 4.1 4.2 4.3 4.4 4.5
[46] 4.6 4.7 4.8 4.9 5.0 5.1 5.2 5.3 5.4 5.5 5.6 5.7 5.8 5.9 6.0
[61] 6.1 6.2 6.3 6.4 6.5 6.6 6.7 6.8 6.9 7.0 7.1 7.2 7.3 7.4 7.5
[76] 7.6 7.7 7.8 7.9 8.0 8.1 8.2 8.3 8.4 8.5 8.6 8.7 8.8 8.9 9.0
[91] 9.1 9.2 9.3 9.4 9.5 9.6 9.7 9.8 9.9 10.0
> ifelse(x < 5, 1, 2)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[38] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
[75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
y <- if (x < 5) 1 else 2
does not operate on the whole vector. In R versions before 4.2.0, the warning you receive tells you that only the first element of the condition will be used; since R 4.2.0, a condition of length greater than one is an error. You want ifelse:
y <- ifelse(x < 5, 1, 2)
ifelse operates on the whole logical vector, element by element; if accepts only one logical value. See ?"if" and ?ifelse.
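To make the difference concrete, here is a minimal sketch using the same x as in the question. The variable z and the use of all() are illustrative additions, not part of the original code; they show one way to reduce a vector condition to the single value that if expects.

```r
x <- seq(0.1, 10, 0.1)

# ifelse() tests every element and returns a vector
# with the same length as the test.
y <- ifelse(x < 5, 1, 2)
length(y) == length(x)

# if () needs a single TRUE/FALSE. When a scalar decision is what you
# actually want, reduce the condition first, e.g. with all() or any().
# (z is a hypothetical name used only for this illustration.)
z <- if (all(x < 5)) 1 else 2
```

Use ifelse when you want one result per element, and if with all() or any() when you want one decision for the whole vector.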
For completeness: on big vectors, you can use logical indexing instead to speed things up (we do that often in simulations, where functions typically run 1000 to 10000 times). But unless you need the speed, just use ifelse. It is a lot easier to read.
> set.seed(100)
> x <- runif(1000,1,10)
> system.time(replicate(10000,{
+ y <- ifelse(x < 5,1,2)
+ }))
   user  system elapsed
   2.56    0.08    2.64
> system.time(replicate(10000,{
+ y <- rep(2,length(x))
+ y[x < 5]<- 1
+ }))
   user  system elapsed
   0.48    0.00    0.48
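If you use the indexing trick in several places, it could be wrapped in a small helper. The sketch below is only an illustration: recode_by_index is a hypothetical name, and it assumes the condition contains no NA values and that yes and no are single values. With an NA in the condition, ifelse returns NA at that position, while this version would leave the "no" value there.

```r
# Hypothetical helper (the name is illustrative, not from base R):
# fill the result with the "no" value, then overwrite the positions
# where the condition is TRUE with the "yes" value.
# Assumes cond has no NAs and that yes/no are length-one values.
recode_by_index <- function(cond, yes, no) {
  out <- rep(no, length(cond))
  out[cond] <- yes
  out
}

set.seed(100)
x <- runif(1000, 1, 10)

y_index  <- recode_by_index(x < 5, 1, 2)
y_ifelse <- ifelse(x < 5, 1, 2)

# Check that both approaches agree on this input.
same <- isTRUE(all.equal(y_index, y_ifelse))
```

Timings will differ by machine and R version, so it is worth re-running a benchmark like the one above before deciding the extra code is justified.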