Programming with dplyr using string as input
dplyr >= 1.0
Use a combination of double braces ({{ }}) and the across() function:
    library(dplyr)

    my_summarise2 <- function(df, group_var) {
      df %>%
        group_by(across({{ group_var }})) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise2(mtcars, "cyl")
    # A tibble: 3 x 2
    #     cyl   mpg
    #   <dbl> <dbl>
    # 1     4  26.7
    # 2     6  19.7
    # 3     8  15.1

    # same result as above, passing cyl without quotes
    my_summarise2(mtcars, cyl)
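When the column names arrive as a character vector stored in a variable, rather than as a single literal string, wrapping the vector in all_of() inside across() is one way to keep the selection explicit. The following is a minimal sketch under that assumption; the helper name my_summarise_by and the choice of grouping columns are hypothetical and only serve as illustration.

```r
library(dplyr)

# Hypothetical helper: group by every column named in a character vector.
my_summarise_by <- function(df, group_vars) {
  df %>%
    # all_of() tells tidyselect that group_vars holds column names as strings
    group_by(across(all_of(group_vars))) %>%
    # .groups = "drop" returns an ungrouped tibble
    summarise(mpg = mean(mpg), .groups = "drop")
}

# One row per cyl/am combination present in mtcars
by_cyl_am <- my_summarise_by(mtcars, c("cyl", "am"))
by_cyl_am
```

all_of() raises an error if any of the named columns is missing, which tends to be easier to debug than a silently empty grouping.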
dplyr < 1.0
As far as I know, you could use as.name or sym (from the rlang package - I don't know if dplyr will import it eventually):
    library(dplyr)

    my_summarise <- function(df, var) {
      var <- rlang::sym(var)
      df %>%
        group_by(!!var) %>%
        summarise(mpg = mean(mpg))
    }
or
    my_summarise <- function(df, var) {
      var <- as.name(var)
      df %>%
        group_by(!!var) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise(mtcars, "cyl")
    # A tibble: 3 × 2
    #     cyl      mpg
    #   <dbl>    <dbl>
    # 1     4 26.66364
    # 2     6 19.74286
    # 3     8 15.10000
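The same idea can be extended to several grouping columns. The sketch below assumes rlang::syms() to turn a character vector into a list of symbols and the !!! operator to splice that list into group_by(); the function name my_summarise_syms is hypothetical.

```r
library(dplyr)

# Hypothetical helper: group by several columns given as strings.
my_summarise_syms <- function(df, vars) {
  # syms() converts each string into a symbol and returns them as a list
  vars <- rlang::syms(vars)
  df %>%
    # !!! splices the list so each symbol becomes its own grouping argument
    group_by(!!!vars) %>%
    summarise(mpg = mean(mpg))
}

by_cyl_gear <- my_summarise_syms(mtcars, c("cyl", "gear"))
by_cyl_gear
```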
Using the .data pronoun from rlang is another option that works directly with column names stored as strings. The function with .data would look like:
    my_summarise <- function(df, var) {
      df %>%
        group_by(.data[[var]]) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise(mtcars, "cyl")
    # A tibble: 3 x 2
    #     cyl   mpg
    #   <dbl> <dbl>
    # 1     4  26.7
    # 2     6  19.7
    # 3     8  15.1
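The .data pronoun is not limited to group_by(); it can also be used where the summarised column is given as a string. A minimal sketch, with the hypothetical function name my_mean_by and hp chosen only as an example column:

```r
library(dplyr)

# Hypothetical helper: both the grouping column and the value column
# are passed as strings.
my_mean_by <- function(df, group_var, value_var) {
  df %>%
    group_by(.data[[group_var]]) %>%
    # .data[[value_var]] looks the column up by name inside the data frame
    summarise(mean = mean(.data[[value_var]]))
}

hp_by_cyl <- my_mean_by(mtcars, "cyl", "hp")
hp_by_cyl
```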
This is how to do it using only dplyr and the very useful as.name function from base R:
    # df is assumed to be a data frame with a grouping column g1
    # and a numeric column a
    my_summarise <- function(df, var) {
      varName <- as.name(var)
      enquo_varName <- enquo(varName)
      df %>%
        group_by(!!enquo_varName) %>%
        summarise(a = mean(a))
    }

    my_summarise(df, "g1")
Basically, with as.name() we generate a name object that matches var (here var is a string). Then, following Programming with dplyr, we use enquo() to look at that name and return the associated value as a quosure. This quosure can then be unquoted inside the group_by() call using !!.
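To make the first step more concrete, here is a small sketch of what as.name() produces from a string. It uses mtcars and the cyl column purely as an illustration, and it unquotes the symbol directly with !! (as in the as.name variant shown earlier) rather than going through enquo().

```r
library(dplyr)

var <- "cyl"
var_name <- as.name(var)

# as.name() returns a symbol (a "name" object), not a string
is_symbol <- is.name(var_name)

# base as.name() and rlang::sym() build the same symbol from a string
same_as_sym <- identical(var_name, rlang::sym(var))

# The symbol can be unquoted with !! inside group_by()
by_cyl <- mtcars %>%
  group_by(!!var_name) %>%
  summarise(mpg = mean(mpg))
by_cyl
```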