Programming with dplyr using string as input


dplyr >= 1.0

Use a combination of double braces ({{ }}) and the across() function:

    library(dplyr)

    my_summarise2 <- function(df, group_var) {
      df %>%
        group_by(across({{ group_var }})) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise2(mtcars, "cyl")
    # A tibble: 3 x 2
    #     cyl   mpg
    #   <dbl> <dbl>
    # 1     4  26.7
    # 2     6  19.7
    # 3     8  15.1

    # same result as above, passing cyl without quotes
    my_summarise2(mtcars, cyl)
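If the grouping columns arrive as a character vector of several names, one possible extension is to wrap the vector in all_of() inside across(). This is only a sketch assuming dplyr >= 1.0; the function name my_summarise_multi and the argument group_vars are hypothetical and chosen for illustration.

```r
library(dplyr)

# Hypothetical helper: group by every column named in a character vector.
my_summarise_multi <- function(df, group_vars) {
  df %>%
    group_by(across(all_of(group_vars))) %>%
    summarise(mpg = mean(mpg), .groups = "drop")
}

res <- my_summarise_multi(mtcars, c("cyl", "am"))
```

all_of() makes the intent explicit (the strings are column names, not values) and raises an error if any named column is missing.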

dplyr < 1.0

As far as I know, you can use as.name() or sym() (from the rlang package; I don't know whether dplyr will eventually import it) to turn the string into a symbol, and then unquote that symbol with !!:

    library(dplyr)

    my_summarise <- function(df, var) {
      var <- rlang::sym(var)
      df %>%
        group_by(!!var) %>%
        summarise(mpg = mean(mpg))
    }

or

    my_summarise <- function(df, var) {
      var <- as.name(var)
      df %>%
        group_by(!!var) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise(mtcars, "cyl")
    # # A tibble: 3 × 2
    #     cyl      mpg
    #   <dbl>    <dbl>
    # 1     4 26.66364
    # 2     6 19.74286
    # 3     8 15.10000
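The same idea should extend to several strings at once: rlang::syms() converts a character vector into a list of symbols, and !!! splices that list into group_by(). The sketch below assumes rlang is installed (it is a dependency of dplyr); the name my_summarise_syms is hypothetical.

```r
library(dplyr)

# Hypothetical helper: one symbol per string, spliced into group_by().
my_summarise_syms <- function(df, vars) {
  vars <- rlang::syms(vars)   # list of symbols, e.g. list(cyl, gear)
  df %>%
    group_by(!!!vars) %>%
    summarise(mpg = mean(mpg)) %>%
    ungroup()
}

res <- my_summarise_syms(mtcars, c("cyl", "gear"))
```

ungroup() is used here instead of the .groups argument because that argument only exists in dplyr >= 1.0.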


Using the .data pronoun from rlang is another option that works directly with column names stored as strings.

The function with .data would look like this:

    my_summarise <- function(df, var) {
      df %>%
        group_by(.data[[var]]) %>%
        summarise(mpg = mean(mpg))
    }

    my_summarise(mtcars, "cyl")
    # A tibble: 3 x 2
    #     cyl   mpg
    #   <dbl> <dbl>
    # 1     4  26.7
    # 2     6  19.7
    # 3     8  15.1
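The .data pronoun is not limited to group_by(); the summarised column can be passed as a string in the same way. A minimal sketch, in which the function name my_summarise_data, its arguments, and the output column mean_value are all hypothetical:

```r
library(dplyr)

# Hypothetical helper: both the grouping and the value column are strings.
my_summarise_data <- function(df, group_var, value_var) {
  df %>%
    group_by(.data[[group_var]]) %>%
    summarise(mean_value = mean(.data[[value_var]]))
}

res <- my_summarise_data(mtcars, "cyl", "hp")
```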


This is how to do it using only dplyr and the very useful as.name function from base R:

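    library(dplyr)

    # df is assumed to be a data frame with a grouping column g1
    # and a numeric column a.
    my_summarise <- function(df, var) {
      varName <- as.name(var)            # string -> name (symbol)
      enquo_varName <- enquo(varName)    # wrap the name in a quosure
      df %>%
        group_by(!!enquo_varName) %>%
        summarise(a = mean(a))
    }

    my_summarise(df, "g1")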

Basically, as.name() generates a name object (a symbol) that matches var (here var is a string). Then, following Programming with dplyr, we use enquo() to capture that name and return it as a quosure. This quosure can then be unquoted inside the group_by() call using !!.
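To see what as.name() contributes, here is a small sketch run at top level on mtcars (an assumption; the example above uses its own df). It unquotes the symbol directly, without the enquo() step, and compares the result with a hand-written group_by(cyl). The variable names var_name, by_string, and by_hand are illustrative only.

```r
library(dplyr)

var <- "cyl"
var_name <- as.name(var)

is.symbol(var_name)                    # TRUE
identical(var_name, rlang::sym(var))   # TRUE: both yield the symbol cyl

# Unquoting substitutes the symbol into the call before it is evaluated,
# so the first pipeline behaves like the hand-written second one.
by_string <- mtcars %>% group_by(!!var_name) %>% summarise(mpg = mean(mpg))
by_hand   <- mtcars %>% group_by(cyl) %>% summarise(mpg = mean(mpg))
```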