Using multiple criteria in subset function and logical operators
The correct operator here is %in%. Here is an example with dummy data:
set.seed(1)
dat <- data.frame(bf11 = sample(4, 10, replace = TRUE), foo = runif(10))
giving:
> head(dat)
  bf11       foo
1    2 0.2059746
2    2 0.1765568
3    3 0.6870228
4    4 0.3841037
5    1 0.7698414
6    4 0.4976992
The subset of dat where bf11 equals any of the set 1, 2, 3 is taken as follows using %in%:
> subset(dat, subset = bf11 %in% c(1,2,3))
   bf11       foo
1     2 0.2059746
2     2 0.1765568
3     3 0.6870228
5     1 0.7698414
8     3 0.9919061
9     3 0.3800352
10    1 0.7774452
As to why your original didn't work, break it down to see the problem. Look at what 1||2||3 evaluates to:
> 1 || 2 || 3
[1] TRUE
and you'd get the same using | instead. As a result, the subset() call would only return rows where bf11 was TRUE (or something that evaluated to TRUE).
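To make the coercion concrete, here is a minimal sketch. The vector x is a hypothetical stand-in for bf11, not your actual data:

```r
# Non-zero numbers are coerced to TRUE by the logical operators,
# so the whole chain collapses to a single TRUE.
chain_scalar <- 1 || 2 || 3
chain_vector <- 1 | 2 | 3

# Hypothetical stand-in for bf11
x <- c(1, 2, 3, 4)

# In a comparison with a number, TRUE is coerced to 1,
# so only elements equal to 1 match.
matches <- x == TRUE
print(matches)
```

So comparing against the collapsed TRUE can pick up, at most, the rows where the value is 1, not the rows matching 2 or 3.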
What you could have written would have been something like:
subset(dat, subset = bf11 == 1 | bf11 == 2 | bf11 == 3)
Which gives the same result as my earlier subset() call. The point is that you need a series of single comparisons, not a comparison of a series of options. But as you can see, %in% is far more useful and less verbose in such circumstances. Notice also that I have to use | as I want to compare each element of bf11 against 1, 2, and 3, in turn. Compare:
> with(dat, bf11 == 1 || bf11 == 2)
[1] TRUE
> with(dat, bf11 == 1 | bf11 == 2)
 [1]  TRUE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE  TRUE
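One caveat: as far as I know, recent versions of R (4.3.0 and later) raise an error when || is given a vector longer than one, rather than silently using the first element as in the output above. The sketch below therefore sticks to forms that should behave the same across versions. The vector x is a hypothetical stand-in for bf11:

```r
# Hypothetical stand-in for bf11
x <- c(2, 2, 3, 4, 1)

# | is vectorised: one TRUE/FALSE per element, which is what subset() needs
elementwise <- x == 1 | x == 2

# || is meant for single values, e.g. in an if() condition
first_only <- x[1] == 1 || x[1] == 2

# To reduce a whole vector to a single TRUE/FALSE, use any() or all()
any_match <- any(x == 1 | x == 2)

print(elementwise)
```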
For your example, I believe the following should work:
myNewDataFrame <- subset(bigfive, subset = bf11 == 1 | bf11 == 2 | bf11 == 3)
See the examples in ?subset for more. Just to demonstrate, a more complicated logical subset would be:
data(airquality)
dat <- subset(airquality, subset = (Temp > 80 & Month > 5) | Ozone < 40)
And as Chase points out, %in% would be more efficient in your example:
myNewDataFrame <- subset(bigfive, subset = bf11 %in% c(1, 2, 3))
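If you want to convince yourself that the two forms select the same rows, here is a small sketch. The data frame toy and its values are made up for illustration, with one missing value included, since that is where the two conditions differ:

```r
# Hypothetical data, including a missing value
toy <- data.frame(bf11 = c(2, 4, NA, 1, 3, 4))

# %in% never returns NA: a missing value simply does not match
by_in <- toy$bf11 %in% c(1, 2, 3)

# A chain of == comparisons yields NA for a missing value
by_or <- toy$bf11 == 1 | toy$bf11 == 2 | toy$bf11 == 3

# subset() treats NA in the condition as FALSE, so both forms keep the same rows
res_in <- subset(toy, subset = bf11 %in% c(1, 2, 3))
res_or <- subset(toy, subset = bf11 == 1 | bf11 == 2 | bf11 == 3)
same_rows <- identical(res_in, res_or)
print(same_rows)
```

The difference only matters if you use the condition outside subset(), for example with [ indexing, where an NA in the condition produces an NA row.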
As Chase also points out, make sure you understand the difference between | and ||. To see help pages for operators, use ?'||', where the operator is quoted.