Using tryCatch() (within simulations)

R tricks R programming Simulation

I had previously found the structure of tryCatch() difficult: cracked it!

Chris Evans https://www.psyctc.org/R_blog/ (PSYCTC.org)https://www.psyctc.org/psyctc/
2023-11-08
library(ggplot2)
library(tidyverse)
as_tibble(list(x = 1,
               y = 1)) -> tmpDat

# png(file = "tryCatch.png", type = "cairo", width = 6000, height = 4800, res = 300)
ggplot(data = tmpDat,
       aes(x = x, y = y)) +
  geom_text(label = "tryCatch()",
            size = 32,
            colour = "red",
            angle = 30) +
  xlab("") +
  ylab("") +
  theme_bw() +
  theme(panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        panel.border = element_blank(),
        panel.background = element_blank(),
        axis.line = element_blank(),
        axis.text = element_blank(),
        axis.ticks = element_blank()) 
# dev.off()

[updated to improve clarity, I hope, 9.xi.23]

What is tryCatch()

The tryCatch() function is part of base R. It wraps a piece of code (typically a call to another function) and, as the name says, tries to run that code and then catches any errors or warnings that arise when it runs.
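As a minimal sketch of that idea (the names okVal and badVal are just mine for illustration), here is tryCatch() wrapped around one expression that runs cleanly and one that throws an error:

```r
### the expression runs without a problem so tryCatch() just passes its value through
okVal <- tryCatch(sqrt(16),
                  error = function(e) NA)

### the expression throws an error so the value of the error handler is returned instead
badVal <- tryCatch(stop("something went wrong"),
                   error = function(e) NA)

print(okVal)
print(badVal)
```

The first call should give the square root and the second should give NA rather than stopping the script.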

Why this post?!

Because I don’t find the help for tryCatch() very helpful at all. To me, like quite a bit of what I found when I searched the web about tryCatch(), it seemed to be written by people who understand it well, for people who are much more immersed in some of the more subtle and advanced aspects of R than I am. So this is partly to remind me of what I’ve just taught myself, which I will forget as I don’t use tryCatch() all that often. I hope it will also be useful to others.

Why use tryCatch()?

Minor option

I can see that I might want to use this to put different error or warning messages in some of my CECPfuns package of functions. The package is designed to make it easier for therapy and mental health practitioners and researchers who are not statisticians or R experts to use R’s enormous strengths. Some R functions, however amazing, can return errors and warnings that can be confusing for my user group (and me!). Perhaps I should look through the other R functions I call in the package to see if some might throw warnings or errors that I could make more user friendly by wrapping them in a tryCatch(). Not an immediate priority.

Main use for me

My main need for tryCatch() comes when I am creating simulations, as sometimes these hit data that cannot be analysed so the analysis returns an error. Unless I trap the error it will crash the simulation: frustrating if it’s been running for hours and is on the 997th iteration of a planned 1,000! What I need, as I suspect do others who run simulations, is for the simulation to absorb the error and return missing values in place of whatever that function was going to return. That way the simulation can keep running. This is exactly what tryCatch() does!

(This “use case”, as the IT people say, means that in this little post I’m only covering how tryCatch() handles errors, not warnings. A warning coming back during a simulation run doesn’t worry me as it won’t crash the run. There is a lot more that tryCatch() can do that I am also skipping as those things aren’t pertinent for me.)

Very simple example

Hm, it’s surprisingly difficult to find really simple functions in R that don’t actually already do sensible things with mad inputs, usually already returning NA or NaN (“Not a Number”), and outputting a warning. Those wouldn’t crash a simulation.
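A few base R examples of what I mean, as a rough sketch (the variable names are just mine): the first three come back with NaN or NA, two of them with a warning, and the script carries on. Only the last one is a genuine error, which I have wrapped in try() so it doesn't stop things here.

```r
rootNeg <- sqrt(-1)              # NaN, with a warning that NaNs were produced
notNum  <- as.numeric("seven")   # NA, with a warning about coercion
emptyMn <- mean(numeric(0))      # NaN, with no warning at all
print(c(rootNeg, notNum, emptyMn))

### by contrast this is a real error: without try() it would halt the script
errRes <- try(log("seven"), silent = TRUE)
print(inherits(errRes, "try-error"))
```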

Trust me, complicated analyses such as some psychometric and bootstrap ones can, not unreasonably, just return an error and stop on data coming from simulations (and real data)! OK, so here is a silly example: suppose my function wants to get the 5% and 95% quantiles of the first principal component of simulated repertory grids, with the simulated grids having the same numbers of elements and constructs and the same scoring range.

Here is a function to do just that. (I have written the function to show the first simulated grid so you see what’s happening.)

simulateGridPC1v1 <- function(nCons, nElem, minScore, maxScore, nSims) {
  ### vector to store PCs
  vecPCs <- rep(NA, nSims)
  for (i in 1:nSims) {
    tmpMat <- matrix(sample(minScore:maxScore, nCons * nElem, replace = TRUE), nrow = nCons)
    if (i == 1){
      ### print the first simulated grid to give the idea of the simulation
      print("Here is the first simulated grid")
      print(tmpMat)
      print(" ")
    }
    ### get the first principal component of that simulated grid
    ### yes, I used princomp() with cor = TRUE to make problems likely!
    vecPCs[i] <- princomp(tmpMat, cor = TRUE)$sdev[1]
  }
  print("Here are the first PC values:")
  print(vecPCs)
  ### but we just want the 5th and 95th quantiles of the first principal components
  ### of the nSims simulated grids:
  print("Finally the quantiles:")
  quantile(vecPCs, c(.05, .95))
}
set.seed(1234) # make simulation replicable
simulateGridPC1v1(5, 5, 0, 2, 10) # 5 constructs but only 5 elements and 0, 1, 2 scoring
[1] "Here is the first simulated grid"
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    1    1    2
[2,]    1    1    1    1    0
[3,]    0    1    2    0    1
[4,]    2    2    1    2    0
[5,]    0    1    1    2    1
[1] " "
[1] "Here are the first PC values:"
 [1] 1.652448 1.544131 1.616389 1.409200 1.637459 1.647444 1.564582
 [8] 1.620288 1.644854 1.679319
[1] "Finally the quantiles:"
      5%      95% 
1.469919 1.667227 

So that worked for that simulation run with only 10 iterations. However, we clearly need more iterations for sensible estimation of those quantiles, so let’s go to 10,000 iterations.

set.seed(12345) # replicability (in principle)
simulateGridPC1v1(5, 5, 0, 2, 10000)

But that just gives this:

Quitting from lines 84-86 [simulation2] (trycatch.Rmd)
Error in `princomp.default()`:
! cannot use 'cor = TRUE' with a constant variable
Backtrace:
 1. global simulateGridPC1v1(5, 5, 0, 2, 10000)
 3. stats:::princomp.default(tmpMat, cor = TRUE)
Execution halted

So that crashed on some iteration of the loop inside that function, and it crashed because one of the elements came up with no variance, i.e. all the randomly simulated scores on that element were the same (all 0, all 1 or all 2).
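To see that failure in isolation, here is a small sketch that takes the first simulated grid printed earlier and overwrites its third column (element) with a constant. The names constMat and res are mine, and I've used try() purely so the error can be inspected without halting anything.

```r
### the first simulated grid from above, entered row by row
constMat <- matrix(c(1, 0, 1, 1, 2,
                     1, 1, 1, 1, 0,
                     0, 1, 2, 0, 1,
                     2, 2, 1, 2, 0,
                     0, 1, 1, 2, 1),
                   nrow = 5, byrow = TRUE)
### force the third element to have no variance
constMat[, 3] <- 1
print(apply(constMat, 2, sd))  # the third SD is zero

### try() traps the error so we can look at the message
res <- try(princomp(constMat, cor = TRUE), silent = TRUE)
print(inherits(res, "try-error"))
print(conditionMessage(attr(res, "condition")))
```

The message should be the same complaint about a constant variable that stopped the 10,000 iteration run.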

Use tryCatch() to prevent that crashing

What I’ve done is to write a little function tryGetPC1() to replace the call princomp(tmpMat, cor = TRUE)$sdev[1].

tryGetPC1 <- function(tmpMat) {
  ### function takes as input a simulated grid, tmpMat
  ### create a tryCatch function
  ### it is a function but to me the structure/semantics throw me as they feel unusual for R functions
  retVal <- tryCatch(
    ### in effect this first bit of code is the predefined first argument to tryCatch
    ### it codes what will happen if this bit of code runs with no errors or warnings
    {
      ### if this call to princomp() runs without error the function will just return the answer from that call
      return(princomp(tmpMat, cor = TRUE)$sdev[1])
      ### however ...
    },
    ### that comma at the end of that line takes us into the next argument which always starts "error = "
    ### this says what to do if the call in the first argument produced an error, so this is the bit of
    ### tryCatch() that I need for simulations
    ### the little function says what to do if that call failed with an error
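    ### the handler function must accept one argument: tryCatch() passes it the error condition (called e here)
    error = function(e) { 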
      ### that's really simple here, just return a single NA
      ### my real life simulations sometimes need to return something more complex containing NA values
      ### rather than just a single NA, more on that later in the post
      return(NA)
    }
    ### we could have another comma and a new line saying what to do if the call ran but threw a warning
    ### but I don't need that
  )
}
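For comparison, here is a more compact sketch of the same idea. It relies on the fact that tryCatch() itself returns either the value of its first argument or the value of whichever handler was triggered, so the explicit return() calls aren't strictly needed. The name tryGetPC1compact and the two test matrices are mine, not part of the simulation code.

```r
tryGetPC1compact <- function(tmpMat) {
  tryCatch(
    ### value returned if princomp() runs without error
    princomp(tmpMat, cor = TRUE)$sdev[1],
    ### value returned if princomp() throws an error; e is the error condition
    error = function(e) NA_real_
  )
}

### the first simulated grid shown earlier: no constant columns
okMat <- matrix(c(1, 0, 1, 1, 2,
                  1, 1, 1, 1, 0,
                  0, 1, 2, 0, 1,
                  2, 2, 1, 2, 0,
                  0, 1, 1, 2, 1),
                nrow = 5, byrow = TRUE)
### same grid but with a constant third column, which princomp() rejects
badMat <- okMat
badMat[, 3] <- 1

okPC1 <- tryGetPC1compact(okMat)
badPC1 <- tryGetPC1compact(badMat)
print(okPC1)
print(badPC1)
```

I used NA_real_ rather than plain NA only so that the function always returns a numeric value; plain NA would work just as well when the result is stored in a numeric vector.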

OK, we now put that into the simulation, replacing the simple call to princomp()

### now I embed that in the simulation function
simulateGridPC1v2 <- function(nCons, nElem, minScore, maxScore, nSims) {
  ### vector to store PCs
  vecPCs <- rep(NA, nSims)
  for (i in 1:nSims) {
    ### as before, simulate a grid
    tmpMat <- matrix(sample(minScore:maxScore, nCons * nElem, replace = TRUE), nrow = nCons)
    ### but now use the tryCatch() function I created above
    ### this will return the first PC for the grid if princomp() can compute that
    ### but instead of throwing an error and stopping, it will just return NA and 
    ### allow everything to continue on if princomp() does trigger an error
    vecPCs[i] <- tryGetPC1(tmpMat)
  }
  ### but we just want the 5th and 95th quantiles of the first principal components
  ### of the nSims simulated grids:
  ### need to allow missing values in quantile now
  return(cat("That simulation run gave:\n   n(unusuable simulations) = ", sum(is.na(vecPCs)),
             "\n   n(usable simulations) = ", sum(!is.na(vecPCs)),
             "\nand quantiles:",
             "\n   q05 = ", round(quantile(vecPCs, .05, na.rm = TRUE), 4),
             "\n   q95 = ", round(quantile(vecPCs, .95, na.rm = TRUE), 4)))
}
set.seed(1234) # make simulation replicable
### this should work as we know, from using simulateGridPC1v1 that these ten simulations 
### don't throw errors
simulateGridPC1v2(5, 5, 0, 2, 10) # 5 constructs but only 5 elements and 0, 1, 2 scoring
That simulation run gave:
   n(unusuable simulations) =  0 
   n(usable simulations) =  10 
and quantiles: 
   q05 =  1.4699 
   q95 =  1.6672

But now we can handle simulated grids that can’t be crunched to give a first PC:

set.seed(12345) # replicability (in principle)
simulateGridPC1v2(5, 5, 0, 2, 10000)
That simulation run gave:
   n(unusuable simulations) =  608 
   n(usable simulations) =  9392 
and quantiles: 
   q05 =  1.4265 
   q95 =  1.8349

So we can see that 608 of those simulations would have crashed the run but with the tryCatch() the error is caught and the PC is replaced with NA.
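As a rough sanity check on that count, here is the arithmetic for how often we would expect at least one element to have no variance. This is only a sketch and it assumes that constant elements are the only thing that makes princomp() throw an error here, which I haven't proved.

```r
nCons   <- 5   # scores per element (one per construct)
nElem   <- 5   # number of elements (columns)
nScores <- 3   # possible scores: 0, 1 and 2

### probability that one element gets the same score on all five constructs
pOneConstant <- nScores * (1 / nScores)^nCons
### probability that at least one of the five elements is constant
pAnyConstant <- 1 - (1 - pOneConstant)^nElem

print(round(pAnyConstant, 4))
print(round(10000 * pAnyConstant))
```

That comes out at about 6%, i.e. roughly 600 unusable grids in 10,000, which is close to the 608 seen in the run above.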

What if the return value is more complex than a single number?

I needed to use this for simulations involving bootstrap CI estimates, which return something more complex than a single value. I realised that there are two sensible ways to handle that:

Here’s an example extracting the bits of a percentile bootstrap CI call:

### first the percentile method
getPercCI <- function(tmpBootRes, conf = 0.95) {
  ### conf defaults to .95, the same default as boot::boot.ci() uses
  tryCatch({
      tmpCI <- boot::boot.ci(tmpBootRes, type = "perc", conf = conf)
      ### here just pick the bits you want to return
      return(list(percLCLCSC = tmpCI$percent[4],
                  percUCLCSC = tmpCI$percent[5]))
    },
    error = function(e) {
      ### return the same structured named list but with NA values
      return(list(percLCLCSC = NA,
                  percUCLCSC = NA))
    }
  )
}
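Here is a self-contained sketch of the same pattern that doesn't need a bootstrap object. It is not my real simulation code: getMeanCI() is a made-up function that uses the confidence interval from t.test(), which throws an error if the data are constant, and the rest shows one way of stacking the named lists into a data frame so the NA rows sit alongside the usable ones.

```r
getMeanCI <- function(x, conf = 0.95) {
  tryCatch({
      tmpCI <- t.test(x, conf.level = conf)$conf.int
      list(LCL = tmpCI[1],
           UCL = tmpCI[2])
    },
    error = function(e) {
      ### same names, same structure, but missing values
      list(LCL = NA_real_,
           UCL = NA_real_)
    }
  )
}

set.seed(12345)
### tiny samples of 1s and 2s so that some will be constant and make t.test() fail
simList <- lapply(1:20, function(i) getMeanCI(sample(1:2, 3, replace = TRUE)))
### one row per simulation, NA where the analysis failed
simRes <- do.call(rbind, lapply(simList, as.data.frame))
print(simRes)
print(sum(is.na(simRes$LCL)))
```

Because the error handler returns a list with the same names as the successful branch, the rows bind together without any special handling of the failures.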

The umbrella issue

That’s it. I suspect I will need to come back to this every time I write a new simulation now. Or perhaps, now I have created the post I will, for the first time, remember how tryCatch() works and how to use it and never need to come back here. That’s what I call the umbrella issue: if I take one out with me, it won’t rain!

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-SA 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Evans (2023, Nov. 8). Chris (Evans) R SAFAQ: Using tryCatch() (within simulations). Retrieved from https://www.psyctc.org/R_blog/posts/2023-11-08-trycatch/

BibTeX citation

@misc{evans2023using,
  author = {Evans, Chris},
  title = {Chris (Evans) R SAFAQ: Using tryCatch() (within simulations)},
  url = {https://www.psyctc.org/R_blog/posts/2023-11-08-trycatch/},
  year = {2023}
}