Ellipses, eclipses and bulletses

cli
r
rlang
Author
Published

March 12, 2024

tl;dr

You can use {cli} and {rlang} to help create a helpful error-handling function that can prevent an eclipse.

Check one two

I’ve been building an ‘error helper’ function1 called check_class(). You put it inside another function to check if the user has supplied arguments of the expected class. Surprise.

I did this to provide richer, more informative error output compared to a simple if () stop(). But it has a few features I wanted to record here for my own reference.

In particular, check_class():

  • uses {rlang} to handle multiple inputs to the dots (...) argument
  • uses {rlang} to avoid an ‘error-handling eclipse’
  • uses {cli} for pretty error messaging
  • can handle multiple errors and build bulleted {cli} lists from them

Maybe you’ll find something useful here too.

A classy check

Examples

Here are some simple examples of inserting check_class() inside another function. The simple function add_one() expects a numeric value. What happens if you supply a string?

add_one <- function(x) {
  check_class(x, .expected_class = "numeric")
  x + 1
}

add_one("text")
Error in `add_one()`:
! `x` must be of class <numeric>
✖ You provided:
• `x` with class <character>

You get a user-friendly output that tells you the problem and where you went wrong. It doesn’t render in this blog post, but in supported consoles the output will be coloured and you’ll get a backtrace to say where the error occurred. See the image at the top of the post for an example.

You can provide an arbitrary number of values to check_class() for assessment. The example below takes three arguments that should all be numeric. What happens if we supply three objects of the wrong type?

multiply_and_raise <- function(x, y, z) {
  check_class(x, y, z, .expected_class = "numeric")
  x * y ^ z
}

multiply_and_raise(list(), data.frame(), matrix())
Error in `multiply_and_raise()`:
! `x`, `y`, and `z` must be of class <numeric>
✖ You provided:
• `x` with class <list>
• `y` with class <data.frame>
• `z` with class <c("matrix", "array")>

The output now shows each failure as separate bullet points so it’s clear where we made the error and what the problem was.

Function

Below is what the check_class() function actually looks like. I’ve added some comments to explain what’s happening at each step. For demo purposes, the function is equipped to check for numeric and character classes only, but you could expand the switch() statement for other classes too.

#' Check Class of Argument Inputs
#' @param ... Objects to be checked for class.
#' @param .expected_class Character. The name of the class against which objects
#'     should be checked.
#' @param .call The environment in which this function is to be called.
#' @noRd
check_class <- function(
    ...,
    .expected_class = c("numeric", "character"),
    .call = rlang::caller_env()
) {
  
  .expected_class = match.arg(.expected_class)  # ensures 'numeric'/'character'
  
  args <- rlang::dots_list(..., .named = TRUE)  # collect dots values
  
  # Check each value against expected class
  args_are_class <- lapply(
    args,
    function(arg) {
      switch(
        .expected_class,
        numeric   = is.numeric(arg),
        character = is.character(arg),
      )
    }
  )
  
  # Isolate values that have wrong class
  fails_names <- names(Filter(isFALSE, args_are_class))
  
  if (length(fails_names) > 0) {
    
    # Prepare variables with failure information
    fails <- args[names(args) %in% fails_names]
    fails_classes <- sapply(fails, class)
    
    # Build a bulleted {cli}-styled vector of the failures
    fails_bullets <- setNames(
      paste0(
        "{.var ", names(fails_classes), "} with class {.cls ",
        fails_classes, "}"
      ),
      rep("*", length(fails_classes))  # name with bullet point symbol
    )
    
    # Raise the error, printed nicely in {cli} style
    cli::cli_abort(
      message = c(
        "{.var {fails_names}} must be of class {.cls {(.expected_class)}}",
        x = "You provided:", fails_bullets
      ),
      call = .call  # environment of parent function, not check_class() itself
    )
  }
  
}

And now to explain in a bit more depth those features I mentioned.

Features

Ellipses

When a function has a dots (...) argument, it means you can pass an arbitrary number of objects to be captured. Consider paste("You", "smell") (two values), paste("You", "smell", "wonderful") (three), etc, or how you can provide an arbitrary number of column names to dplyr::select().

The first argument to check_class() is .... You pass to it as many values as you need to assess for an expected class. So the function add_one(x) would contain within it a call to check_class(x, .expected_class = "numeric") (one argument to check), while multiply(x, y) would accept check_class(x, y, .expected_class = "numeric") (two)2.

I’ve used the {rlang} package’s dots_list() function to collect the dots elements into a list. The .named = TRUE argument names each element, so we can pinpoint the errors and report them to the user.

I have collaborators, so readability of the code is important. I think rlang::dots_list() is more readable than the base approach, which is something like:

args <- list(...)
arg_names <- as.character(substitute(...()))
names(args) <- arg_names

Eclipses

So: you put check_class() inside another function. This causes a problem: errors will be reported to the user as having been raised by check_class(), but it’s an internal function that they’ll never see. It would be better to report the error has having originated from the parent function instead.

This obfuscation, this ‘code smell’, has been nicknamed an ‘error-handling eclipse’ by Nick Tierney, whose blog post was extremely well-timed for when I was writing check_class().

In short, you can record with rlang::caller_env() the environment in which the check_class() function was used. You can hand that to the call function of cli::cli_abort(), which check_class() uses to build and report error messages. This means the error is reported from the function enclosing check_class(), not from check_class() itself.

For example, here’s an example report_env() function, which prints the environment in which it’s called. Since this is being run in the global environment, the global environment will be printed.

remove(list = ls())  # clear the global environment
report_env <- function(env = rlang::caller_env()) rlang::env_print(env)
report_env()
<environment: global>
Parent: <environment: package:stats>
Bindings:
• report_env: <fn>
• .main: <fn>

If we nest report_env() inside another function then the reported environment is of that enclosing function (expressed here as its bytecode), which itself is nested in its parent (global) environment.

report_env_2 <- function() report_env()
report_env_2()
<environment: 0x10b25d0a0>
Parent: <environment: global>

See the image at the top of this post, which shows the backtrace as having originated from the enclosing add_one() function rather than the check_class() call within it.

Bulletses

The {cli} package lets you build rich user interfaces for your functions3. This is great for composing informative warning and error messages for the user.

Let’s focus on a simplified example of cli::cli_abort(), which is like the {cli} equivalent of stop(). Let’s pretend we passed a character vector when it should have been numeric.

To the message argument you provide a named vector, where the name will be printed as a symbol in the output. This will be a yellow exclamation point for cli_abort() by default, which draws attention to the exact error. The name ‘x’ prints as a red cross to indicate what the user did wrong.

You can also use {glue} syntax in {cli} to evaluate variables. But {cli} goes one further: it has special syntax to provide consistent mark-up to bits of text. For example, "{.var x}" will print with surrounding backticks and "{.cls numeric}" will print in blue with surrounding less/greater than symbols.

fail_class <- "character" 
cli::cli_abort(
  message = c(
    "{.var x} must be of class {.cls numeric}",
    x = "You provided class {.cls {fail_class}}"
  )
)
Error:
! `x` must be of class <numeric>
✖ You provided class <character>

Again, see an example in the image at the top of the post.

Since check_class() can take multiple values via the dots, we can construct an individual report for each failing element. {cli} will automatically turn each of these constructed lines into a bullet point in the printed output if we name them with an asterisk, which is pretty neat.

expected_class <- "numeric"
fails <- list(x = "character", y = "list")
fails_names <- names(fails)

fails_bullets <- setNames(
  paste0("{.var ", fails_names, "} with class {.cls ", fails, "}"),
  rep("*", length(fails))
)

cli::cli_abort(
  message = c(
    "{.var {fails_names}} must be of class {.cls {expected_class}}",
    x = "You provided:", fails_bullets
  )
)
Error:
! `x` and `y` must be of class <numeric>
✖ You provided:
• `x` with class <character>
• `y` with class <list>

Pew pew pew.

Test

Here’s a cheeky bonus if you’re wondering how to test for {cli} messages: you can use cli::test_that_cli() to test the output against an earlier snapshot.

cli::test_that_cli("prints expected error", {
  testthat::local_edition(3)  # only works with {testthat} 3e
  testthat::expect_snapshot({
    check_class(x = 1, y = "x", .expected_class = "numeric")
  })
})

Error-helper help?

Is this horribly overengineered? What is your approach to creating friendly and actionable error messages for your users?

Environment

Session info
Last rendered: 2024-04-10 08:54:28 BST
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/London
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] digest_0.6.35     utf8_1.2.4        fastmap_1.1.1     xfun_0.43        
 [5] glue_1.7.0        knitr_1.45        htmltools_0.5.8   rmarkdown_2.26   
 [9] lifecycle_1.0.4   cli_3.6.2         fansi_1.0.6       vctrs_0.6.5      
[13] compiler_4.3.1    rstudioapi_0.16.0 tools_4.3.1       evaluate_0.23    
[17] pillar_1.9.0      yaml_2.3.8        rlang_1.1.3       jsonlite_1.8.8   
[21] htmlwidgets_1.6.2

Footnotes

  1. {rlang} has some helpful documentation on error helpers, which you can find by typing ?rlang::`topic-error-call` into the console↩︎

  2. I’ve used a dot-prefix for the remaining check_class() arguments, which reduces the chance of a clash with user-supplied values to the dots. This is recommended in the Tidy Design Principles book.↩︎

  3. I wrote about {cli} in an earlier post, where I explored its ability to generate hyperlinks in the R console. I used it for fun (to build a choose-your-own-adventure in the console), but it can be useful for things like opening a file at the exact line where a test failure occurred.↩︎

Reuse

CC BY-NC-SA 4.0