How to retrieve the most repeated value in a column present in a data frame


tail(names(sort(table(Forbes2000$category))), 1)
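The one-liner above can be unpacked step by step. The sketch below uses a small, made-up data frame `df` as a stand-in for `Forbes2000` (the values are illustrative only, not taken from the real data set):

```r
# Hypothetical stand-in for Forbes2000; only the `category` column matters here.
df <- data.frame(
  category = c("Banking", "Insurance", "Banking", "Retailing", "Banking"),
  stringsAsFactors = FALSE
)

counts <- table(df$category)     # frequency of each distinct value
sorted <- sort(counts)           # ascending order, so the most frequent is last
top <- tail(names(sorted), 1)    # name of the last (most frequent) entry
top

# A shorter alternative: which.max() gives the position of the first maximum.
top_alt <- names(which.max(counts))
top_alt
```

Both variants return a single value, so if several categories share the top count you only see one of them.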


In case two or more categories may be tied for most frequent, use something like this:

x <- c("Insurance", "Insurance", "Capital Goods", "Food markets", "Food markets")
tt <- table(x)
names(tt[tt==max(tt)])
[1] "Food markets" "Insurance"
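If you need this in more than one place, the tie-aware approach can be wrapped in a small function. This is only a sketch: the name `most_frequent` is made up for illustration, and it relies on `table()` dropping `NA` values by default.

```r
# Hypothetical helper: returns every value tied for the highest count.
# The result is always a character vector, because table() stores the
# values as names.
most_frequent <- function(x) {
  tt <- table(x)
  if (length(tt) == 0) {
    return(character(0))         # nothing to count (empty or all-NA input)
  }
  names(tt)[as.vector(tt) == max(tt)]
}

x <- c("Insurance", "Insurance", "Capital Goods", "Food markets", "Food markets")
most_frequent(x)
```

For a data frame column you would call it as `most_frequent(Forbes2000$category)`. Convert the result back with `as.numeric()` if the column was numeric.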


Another way is to use the data.table package, which is faster for large data sets:

set.seed(1)
x <- sample(seq(1, 100), 5000000, replace = TRUE)

Method 1 (base R table(), the solution proposed above)

start.time <- Sys.time()
tt <- table(x)
names(tt[tt==max(tt)])
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken

Time difference of 4.883488 secs

Method 2 (data.table)

library(data.table)
start.time <- Sys.time()
ds <- data.table(x)
setkey(ds, x)
sorted <- ds[, .N, by = list(x)]
most_repeated_value <- sorted[order(-N)]$x[1]
most_repeated_value
end.time <- Sys.time()
time.taken <- end.time - start.time
time.taken

Time difference of 0.328033 secs
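Note that the data.table code above keeps only the first value after ordering, so ties are silently dropped, whereas the `table()` version returns all tied values. A possible tie-aware data.table variant is sketched below, on a smaller sample so it runs quickly. The names `dt`, `counts`, `modes_dt` and `modes_base` are made up for this example, and it assumes the data.table package is installed.

```r
library(data.table)

set.seed(1)
x <- sample(1:100, 1e5, replace = TRUE)

dt <- data.table(x = x)
counts <- dt[, .N, by = x]            # one row per distinct value, with its count N
modes_dt <- counts[N == max(N), x]    # every value tied for the highest count

# Cross-check against the base R approach; table() keeps the values as
# character names, so convert them back to integer before comparing.
tt <- table(x)
modes_base <- as.integer(names(tt[tt == max(tt)]))
setequal(modes_dt, modes_base)
```

No timings are shown here because they depend on the machine and the data. Wrapping each variant in `system.time()` is a simple way to compare them on your own data.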