How to select the rows with maximum values in each group with dplyr? [duplicate]
Try this:
result <- df %>% group_by(A, B) %>% filter(value == max(value)) %>% arrange(A,B,C)
Seems to work:
identical(as.data.frame(result), ddply(df, .(A, B), function(x) x[which.max(x$value), ])) # [1] TRUE
As pointed out in the comments, slice may be preferred here, as per @RoyalITS' answer below, if you strictly want only one row per group. This answer returns multiple rows if several rows in a group share the same maximum value.
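To make the difference concrete, here is a minimal sketch on a hypothetical toy data frame (the data and the names with_ties and one_per_group are made up for illustration, not taken from the question). It assumes slice(which.max(value)) is the kind of call the comments had in mind.

```r
library(dplyr)

# Hypothetical toy data: group (a, x) has two rows tied at the maximum value.
df <- data.frame(
  A = c("a", "a", "a", "b"),
  B = c("x", "x", "x", "y"),
  C = c(1, 2, 3, 4),
  value = c(5, 5, 1, 7)
)

# filter() keeps every row equal to the group maximum, so ties survive.
with_ties <- df %>% group_by(A, B) %>% filter(value == max(value))

# which.max() returns the first maximal position, so slice() keeps one row per group.
one_per_group <- df %>% group_by(A, B) %>% slice(which.max(value))

nrow(with_ties)      # 3
nrow(one_per_group)  # 2
```

Which one you want depends on whether tied rows are meaningful in your data or just noise.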
You can use top_n
df %>% group_by(A, B) %>% top_n(n=1)
This will rank by the last column (value) and return the top n=1 rows.
Currently, you can't change this default without causing an error (see https://github.com/hadley/dplyr/issues/426).
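For readers on a dplyr release where that issue no longer applies (an assumption about your installed version, not something this answer confirms), a sketch of naming the ranking column explicitly through the wt argument might look like this. The data frame is a hypothetical example in which value is deliberately not the last column.

```r
library(dplyr)

# Hypothetical toy data: value is NOT the last column, so the default ranking
# (by the last column, C) would pick different rows.
df <- data.frame(
  A = c("a", "a", "b", "b"),
  B = c("x", "x", "y", "y"),
  value = c(1, 3, 2, 8),
  C = c(20, 10, 40, 30)
)

# Rank by value explicitly instead of relying on the last-column default.
top_rows <- df %>% group_by(A, B) %>% top_n(n = 1, wt = value)

top_rows$value  # 3 8
```

Note that top_n, like the filter approach, keeps tied rows, so it does not guarantee exactly one row per group.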