
How to parametrize function calls in dplyr 0.7?


dplyr 0.7 includes group_by_at, a specialized variant of group_by for working with multiple grouping variables supplied as a vector. This new member of the _at family is much easier to use:

# using the pre-release 0.6.0
cols <- c("am", "gear")
mtcars %>%
    group_by_at(.vars = cols) %>%
    summarise(mean_cyl = mean(cyl))
# Source: local data frame [4 x 3]
# Groups: am [?]
#
#      am  gear mean_cyl
#   <dbl> <dbl>    <dbl>
# 1     0     3 7.466667
# 2     0     4 5.000000
# 3     1     4 4.500000
# 4     1     5 6.000000

The .vars argument accepts a character vector of column names, a numeric vector of column positions, or a list of columns generated by vars(). From the documentation:

.vars

A list of columns generated by vars(), or a character vector of column names, or a numeric vector of column positions.
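As a small sketch of the three accepted forms, the snippet below groups mtcars by am and gear using names, positions, and vars(). The positions 9 and 10 are taken from the column order of the built-in mtcars data set, and the variable names by_name, by_pos, and by_vars are only illustrative.

```r
library(dplyr)

# Character vector of column names
by_name <- mtcars %>%
  group_by_at(.vars = c("am", "gear")) %>%
  summarise(mean_cyl = mean(cyl))

# Numeric vector of column positions (am is column 9, gear is column 10 in mtcars)
by_pos <- mtcars %>%
  group_by_at(.vars = c(9, 10)) %>%
  summarise(mean_cyl = mean(cyl))

# List of columns generated by vars()
by_vars <- mtcars %>%
  group_by_at(.vars = vars(am, gear)) %>%
  summarise(mean_cyl = mean(cyl))
```

All three calls should produce the same four-row summary, so the choice mostly depends on how the column specification reaches your code.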


Here's the quick and dirty reference I wrote for myself.

# install.packages("rlang")
library(tidyverse)
dat <- data.frame(cat = sample(LETTERS[1:2], 50, replace = TRUE),
                  cat2 = sample(LETTERS[3:4], 50, replace = TRUE),
                  value = rnorm(50))

Representing column names with strings

Convert strings to symbol objects using rlang::sym and rlang::syms.

summ_var <- "value"
group_vars <- c("cat", "cat2")
summ_sym <- rlang::sym(summ_var)  # create a single symbol
group_syms <- rlang::syms(group_vars)  # create a list of symbols
dat %>%
  group_by(!!!group_syms) %>%  # splice the list of symbols into the function call
  summarize(summ = sum(!!summ_sym))  # unquote the single symbol into the call

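!! and !!! are only understood by functions that support quasiquotation, such as the dplyr verbs. Elsewhere they are parsed as repeated logical negation, so using them on a symbol or a list of symbols will raise an error.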
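One way to see what unquoting does is to build the call without running it. The sketch below uses rlang::expr, which supports the same !! and !!! operators, to show the call each operator produces; the names grouped_call, summ_call, and outside are illustrative only.

```r
library(rlang)

group_syms <- syms(c("cat", "cat2"))
summ_sym <- sym("value")

# expr() builds the call without evaluating it, so no data frame is needed here
grouped_call <- expr(group_by(dat, !!!group_syms))
summ_call <- expr(summarize(summ = sum(!!summ_sym)))
print(grouped_call)
print(summ_call)

# Outside a quasiquotation context, !! is just two applications of `!`,
# which fails on a symbol; the error is captured instead of stopping the script
outside <- tryCatch(!!summ_sym, error = function(e) e)
print(conditionMessage(outside))
```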

The usage of rlang::sym and rlang::syms is identical inside functions.

summarize_by <- function(df, summ_var, group_vars) {
  summ_sym <- rlang::sym(summ_var)
  group_syms <- rlang::syms(group_vars)
  df %>%
    group_by(!!!group_syms) %>%
    summarize(summ = sum(!!summ_sym))
}

We can then call summarize_by with string arguments.

summarize_by(dat, "value", c("cat", "cat2"))
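When the grouping columns already arrive as strings, the string-based approach can also be combined with group_by_at, which takes a character vector directly. The following is a minimal sketch under that assumption; summarize_by_at is a hypothetical name, and mtcars is used so the result is reproducible.

```r
library(dplyr)

# Hypothetical variant: group_by_at() handles the character vector,
# so only the summarised column needs to be converted to a symbol
summarize_by_at <- function(df, summ_var, group_vars) {
  summ_sym <- rlang::sym(summ_var)
  df %>%
    group_by_at(.vars = group_vars) %>%
    summarize(summ = sum(!!summ_sym))
}

res <- summarize_by_at(mtcars, "cyl", c("am", "gear"))
print(res)
```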

Using non-standard evaluation for column/variable names

summ_quo <- quo(value)  # capture a single variable for NSE
group_quos <- quos(cat, cat2)  # capture a list of variables for NSE
dat %>%
  group_by(!!!group_quos) %>%  # use !!! with both quos and rlang::syms
  summarize(summ = sum(!!summ_quo))  # use !! with both quo and rlang::sym

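Inside functions, use enquo rather than quo to capture an argument supplied by the caller. quos can still be used inside functions to capture everything passed through the dots (...).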

summarize_by <- function(df, summ_var, ...) {
  summ_quo <- enquo(summ_var)  # captures a single argument
  group_quos <- quos(...)  # captures all arguments passed through the dots
  df %>%
    group_by(!!!group_quos) %>%
    summarize(summ = sum(!!summ_quo))
}

And then our function call is

summarize_by(dat, value, cat, cat2)
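To illustrate why enquo is needed inside a function, the sketch below compares what quo and enquo capture when applied to a function argument. The helpers with_quo and with_enquo are hypothetical, and rlang::get_expr is used only to inspect the captured expression.

```r
library(rlang)

# quo() quotes the expression as written in the function body, i.e. the name `x`
with_quo <- function(x) quo(x)

# enquo() looks through the argument and quotes what the caller supplied
with_enquo <- function(x) enquo(x)

captured_by_quo <- get_expr(with_quo(value))
captured_by_enquo <- get_expr(with_enquo(value))

print(captured_by_quo)    # the argument name used inside the function
print(captured_by_enquo)  # the column name written by the caller
```

Neither call evaluates value, so the comparison works even though no object of that name exists.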


If you want to group by possibly more than one column, you can use quos:

grouping_vars <- quos(am, gear)
mtcars %>%
  group_by(!!!grouping_vars) %>%
  summarise(mean_cyl = mean(cyl))
#      am  gear mean_cyl
#   <dbl> <dbl>    <dbl>
# 1     0     3 7.466667
# 2     0     4 5.000000
# 3     1     4 4.500000
# 4     1     5 6.000000

Right now, there doesn't seem to be a great way to turn strings into quos. Here's one way that does work, though:

cols <- c("am", "gear")
grouping_vars <- rlang::parse_quosures(paste(cols, collapse = ";"))
mtcars %>%
  group_by(!!!grouping_vars) %>%
  summarise(mean_cyl = mean(cyl))
#      am  gear mean_cyl
#   <dbl> <dbl>    <dbl>
# 1     0     3 7.466667
# 2     0     4 5.000000
# 3     1     4 4.500000
# 4     1     5 6.000000
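For plain column names, symbols created with rlang::syms can be spliced in the same way as quosures. The sketch below is a consistency check under that assumption, not a replacement for the approach above; from_strings and from_quos are illustrative names.

```r
library(dplyr)

cols <- c("am", "gear")

# Grouping columns given as strings, converted to symbols and spliced
from_strings <- mtcars %>%
  group_by(!!!rlang::syms(cols)) %>%
  summarise(mean_cyl = mean(cyl))

# The same grouping written with bare column names captured by quos()
from_quos <- mtcars %>%
  group_by(!!!quos(am, gear)) %>%
  summarise(mean_cyl = mean(cyl))

# Check that both routes give the same summary
print(all.equal(as.data.frame(from_strings), as.data.frame(from_quos)))
```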