Sample random rows in dataframe
First make some data:
> df = data.frame(matrix(rnorm(20), nrow=10))
> df
           X1         X2
1   0.7091409 -1.4061361
2  -1.1334614 -0.1973846
3   2.3343391 -0.4385071
4  -0.9040278 -0.6593677
5   0.4180331 -1.2592415
6   0.7572246 -0.5463655
7  -0.8996483  0.4231117
8  -1.0356774 -0.1640883
9  -0.3983045  0.7157506
10 -0.9060305  2.3234110
Then select some rows at random:
> df[sample(nrow(df), 3), ]
           X1         X2
9  -0.3983045  0.7157506
2  -1.1334614 -0.1973846
10 -0.9060305  2.3234110
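Because no seed is set above, each run picks different rows. A minimal sketch of the same approach made reproducible is below; the seed value and the names idx, sampled and boot are only illustrative.

```r
# Fix the random number generator so the draw can be repeated.
set.seed(42)  # arbitrary seed, chosen only for illustration
df <- data.frame(matrix(rnorm(20), nrow = 10))

# sample(n, k) draws k distinct integers from 1..n (no replacement by default).
idx <- sample(nrow(df), 3)
sampled <- df[idx, ]

# With replace = TRUE the same row may be drawn more than once,
# which is what a bootstrap resample needs.
boot <- df[sample(nrow(df), nrow(df), replace = TRUE), ]
```

If the data frame has only one column, df[idx, ] returns a plain vector; writing df[idx, , drop = FALSE] keeps it a data frame.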
The answer John Colby gives is the right answer. However, if you are a dplyr user, there is also sample_n:
sample_n(df, 10)
This randomly samples 10 rows from the data frame. It calls sample.int, so it is really the same answer with less typing (and it is easier to use in a magrittr pipeline, since the data frame is the first argument).
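As a rough sketch of the pipeline usage, assuming dplyr is installed (the seed and the names picked and picked2 are illustrative). Newer dplyr releases (1.0.0 and later) also offer slice_sample, which supersedes sample_n; both are shown here.

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the example is repeatable
df <- data.frame(matrix(rnorm(20), nrow = 10))

# The data frame is the first argument, so it pipes naturally.
picked <- df %>% sample_n(3)

# Equivalent call with the newer verb (dplyr >= 1.0.0).
picked2 <- df %>% slice_sample(n = 3)
```

Both calls sample without replacement by default; pass replace = TRUE to either one to allow repeated rows.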
With the data.table package, the expression DT[sample(.N, M)] samples M random rows from the data table DT. Inside the brackets, .N is the number of rows in DT.
library(data.table)
set.seed(10)
mtcars <- data.table(mtcars)
mtcars[sample(.N, 6)]
    mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1: 14.7   8 440.0 230 3.23 5.345 17.42  0  0    3    4
2: 19.2   6 167.6 123 3.92 3.440 18.30  1  0    4    4
3: 17.3   8 275.8 180 3.07 3.730 17.60  0  0    3    3
4: 21.5   4 120.1  97 3.70 2.465 20.01  1  0    3    1
5: 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
6: 15.5   8 318.0 150 2.76 3.520 16.87  0  0    3    2
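Note that the output above no longer shows the car names, because converting to a data.table drops row names. One possible variation, assuming you want to keep them, is to convert with as.data.table and keep.rownames; the column name "model" and the names dt and picked are just illustrative choices.

```r
library(data.table)

set.seed(10)
# keep.rownames = "model" stores the old row names in a column called "model".
dt <- as.data.table(mtcars, keep.rownames = "model")

# Same sampling idiom as before: .N is the row count of dt.
picked <- dt[sample(.N, 6)]
```

This also avoids overwriting the built-in mtcars object with a data.table of the same name.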