Plot Caching

Creating plots in a Shiny application can take anywhere from a fraction of a second to multiple seconds. If there are multiple plots in an application they can be a significant source of perceived slowness. Improving the responsiveness of plots can greatly improve users’ experience of your application.

There are multiple ways you can improve the performance of your plots. For example, you could consider using R’s base graphics instead of a plotting package, using JavaScript graphics that render on the client instead of static plots that render on the server, or you could change the type of plot, e.g. switching from ggplot2::geom_point to ggplot2::geom_hex. However, there are cases where you might want to keep using the plotting code you already have.

As of Shiny 1.2.0, it is possible to cache plots with renderCachedPlot(). Plot caching can significantly improve the performance of your Shiny application with minimal code changes. Plot caching works by storing rendered plots in a cache so that, if the same plot is requested again, it can be drawn from the cache almost instantly. By default, the cache is shared among multiple users of an application. The more users your application has, the more performance benefits you’ll see from caching!

Example scenario: dashboard

Imagine a dashboard containing three plots which take a total of 3 seconds to render. With the usual renderPlot(), every user will have to wait 3 seconds to load the dashboard. If there are many concurrent users, then it might take even longer for some users, because some users might need to wait for others’ plots to finish rendering before theirs can be rendered. (This is assuming that the application is served with one R process; it is, however, possible to avoid this problem by using more R processes, a feature supported in Shiny Server Pro or RStudio Connect.)

If the dashboard switches to renderCachedPlot(), then the first user to visit will have to wait 3 seconds for the plots to render, but then every subsequent user will get the plots from the shared cache, which is almost instantaneous. The more users there are, the more likely it is for any given person to get a cache hit, and the greater the average performance benefit will be.

Using cached plots

Usage is straightforward: in the most basic form, simply replace your renderPlot() with renderCachedPlot() and add a cache key expression argument. For example, your server function might look like this:

function(input, output) {
  output$plot <- renderCachedPlot(
    {
      rownums <- seq_len(input$n)
      plot(cars$speed[rownums], cars$dist[rownums])
    },
    cacheKeyExpr = { input$n }
  )
}

In this case, the first time a particular value of input$n is seen, Shiny will render the plot and store it in the cache. If it changes to another value and then back again, instead of re-executing the plotting code, Shiny will simply get the saved plot from the cache.

Click here to see full app code
library(shiny)
shinyApp(
  fluidPage(
    sidebarLayout(
      sidebarPanel(
        sliderInput("n", "Number of points", 4, 32, value = 8, step = 4)
      ),
      mainPanel(plotOutput("plot"))
    )
  ),
  function(input, output, session) {
    output$plot <- renderCachedPlot(
      {
        Sys.sleep(2)  # Add an artificial delay
        rownums <- seq_len(input$n)
        plot(cars$speed[rownums], cars$dist[rownums],
            xlim = range(cars$speed), ylim = range(cars$dist))
      },
      cacheKeyExpr = { input$n }
    )
  }
)

You can see the application in action below (or here).

In this example, there are two expressions given to renderCachedPlot(). The first expression contains the code that generates a plot. Unlike the plotting expression that’s used in a regular renderPlot(), this expression does not take any reactive dependencies – it will not, by itself, automatically re-execute. That’s where the second expression comes in. You can, of course, reference user inputs in this expression; it is only the effect on the reactive dependency graph that is different.

The second expression, cacheKeyExpr, serves two purposes. The first is that it sets up reactive dependencies: whenever any reactive expressions or reactive values in that expression change, it causes the plotting expression to re-execute – with a big exception, which we’ll see soon. (If you’ve used eventReactive() or observeEvent(), this is similar to the eventExpr that they have.) In technical terms, when cacheKeyExpr is invalidated, it causes the plotting expression to re-execute.

The other use of cacheKeyExpr is, not surprisingly, for caching. When the plot expression is executed, the resulting plot is stored in a cache, using the result from cacheKeyExpr as the cache key. (Technical note: the value from cacheKeyExpr is actually serialized and then hashed, and the resulting hash value is used as the key.) Whenever the cacheKeyExpr is invalidated and re-executed, Shiny first looks in the cache to see if there’s a previously-saved plot. If there is, then the saved plot is sent to the client web browser, and the plotting expression does not need to re-execute. If there is not, then Shiny re-executes the plotting expression, caches the resulting plot, and sends it to the browser.

The cache key expression

In the example above, the cache key expression only contains one thing, input$n. It can, however, contain multiple values, if you simply combine them in a list. For example, in addition to user input values that can change, you may have a data set that can change. The cacheKeyExpr might look like this:

  output$plot <- renderCachedPlot(
    { 
      # Plotting code here...
    },
    cacheKeyExpr = { list(input$n, dataset()) }
  )
Click here to see full app
library(shiny)
dataset <- reactiveVal(data.frame(x = rnorm(400), y = rnorm(400)))

ui <- fluidPage(
  sidebarLayout(
    sidebarPanel(
      sliderInput("n", "Number of points to display", 50, 400, 100, step = 50),
      actionButton("newdata", "Generate new data")
    ),
    mainPanel(
      plotOutput("plot")
    )
  )
)

server <- function(input, output, session) {
  # When the newdata button is clicked, change the data set to new random data
  observeEvent(input$newdata, {
    dataset(data.frame(x = rnorm(400), y = rnorm(400)))
  })

  output$plot <- renderCachedPlot(
    {
      Sys.sleep(2)     # Add an artificial delay
      d <- dataset()
      rownums <- seq_len(input$n)
      plot(d$x[rownums], d$y[rownums], xlim = range(d$x), ylim = range(d$y))
    },
    cacheKeyExpr = { 
      list(input$n, dataset())
    }
  )
}

shinyApp(ui, server)

You can see the application in action below (or here).

When creating the cacheKeyExpr, keep in mind that the entire return value is serialized and hashed. If you are hashing a large object, this may take a non-negligible amount of time. In this example, hashing the data object is still very fast: about 1 millisecond on a modern machine. You can test this with the code below:

d <- data.frame(x = rnorm(400), y = rnorm(400))
system.time(digest::digest(d, "xxhash64"))
#>    user  system elapsed 
#>   0.000   0.000   0.001 

(Note: The hashing algorithm used in renderCachedPlot is xxHash-64.)

Even a larger data set like diamonds from the ggplot2 package (which has 10 columns and 53,940 rows) is reasonably fast, at about 14 milliseconds. Hashing this data is much faster than rendering a scatter plot of it. See this FAQ to learn more about measuring time spent computing hashes.

There are still some cases where it may be useful to preprocess an object before it is used as a cache key (where it will be serialized and hashed). If, for example, your data lives in a remote database, it will probably not be efficient to pull the data and hash it. But you can do other things that work just as well for the purpose of caching: for example, if you can query the database for a timestamp of the most recent update, you could use the timestamp in the cache key instead of the data itself. So your cacheKeyExpr might look like this, where mostRecentChange() is some function that returns the timestamp:

  cacheKeyExpr = {
    timestamp <- mostRecentChange(db)
    list(input$n, timestamp)
  }

Note that the value returned by the cacheKeyExpr is the one that is serialized and hashed. In this case, the return value is the last line of the cacheKeyExpr. Any code above is executed, but is not used as part of the key.
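
For illustration, mostRecentChange() might be implemented with a quick query against the database. This is only a sketch: it assumes a DBI connection, and the table and column names here are hypothetical.

# Hypothetical helper: ask the database for the most recent modification time.
# Assumes a DBI connection `db` and a table `mytable` with an `updated_at` column.
mostRecentChange <- function(db) {
  DBI::dbGetQuery(db, "SELECT MAX(updated_at) AS ts FROM mytable")$ts
}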

One more detail about the cacheKeyExpr: it is only part of the actual cache key that’s used. The actual cache key also includes the name of the output ("plot" in this case) and the width and height of the plot. Those things are combined with cacheKeyExpr in a list; then that list is hashed and the result is used as the actual cache key. This means it is not possible for two separate outputs (say, output$plot1 and output$plot2) to share cached plots, because the different output names result in different cache keys.
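
Conceptually, the effective key is formed something like this (a simplification for illustration only, not Shiny’s exact internal code):

# A rough sketch of how the actual cache key is formed:
key <- digest::digest(
  list(
    "plot",     # the output name
    480, 480,   # the rendered plot width and height
    input$n     # the value returned by cacheKeyExpr
  ),
  "xxhash64"
)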

Plot sizing

Sizing for cached plots works a bit differently from regular plots: with regular plots, the plot is rendered to exactly fit the div on the web page; with cached plots, the plot is rendered to a close-fitting size, and scaled to fit the div on the web page.

With renderPlot(), the plot is rendered at exactly the dimensions of the div containing the image in the browser. If the div is 500 pixels wide and 400 pixels tall, then it will create a plot that is exactly 500×400 pixels. If you resize the window, and the div then becomes 550×400 pixels (typically the width is variable, but the height is fixed), then Shiny will render another plot that is 550×400, which can take some time.

With renderCachedPlot(), the plot is not rendered to be an exact fit. There are a number of possible sizes, and Shiny will render the plot to be the closest size that is larger than the div on the web page, and cache it. For example, possible widths include 400, 480, 576, 691, and so on, both smaller and larger; each width is 20% larger than the previous one. Heights work the same way.

If the width of the div is 450 pixels, then Shiny will render a plot that is 480 pixels wide and scale it down to fit the 450 pixel wide div. If the div is then resized to 500 pixels, then Shiny will render a plot that is 576 pixels wide.

The reason that renderCachedPlot() works this way is so that it doesn’t have to cache a plot of every possible size; doing that would greatly reduce the usefulness of caching, since each browser would likely have a slightly different width, and so there would be very few cache hits.

This behavior is controlled by the sizePolicy parameter – it is a function that takes two numbers (the actual dimensions of the div) and returns two numbers (the dimensions of the plot that will be rendered). If you want to use a different strategy, you can pass in a different function.
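
For example, Shiny’s sizeGrowthRatio() generates such a function. Here is a sketch of a custom policy that starts at 480 pixels and grows in 25% steps; the particular numbers are just for illustration:

output$plot <- renderCachedPlot(
  {
    # Plotting code here...
  },
  cacheKeyExpr = { input$n },
  # Render at the smallest size in the sequence 480, 600, 750, ... that fits the div
  sizePolicy = sizeGrowthRatio(width = 480, height = 480, growthRate = 1.25)
)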

Cache scoping

By default, a cached plot will be shared among all the browser sessions connected to a single R process running Shiny. It is possible to change this option so that the cached plot is used only within one session. There are also options to allow multiple R processes to share the same cache; see the section below on disk caching.

In most cases, you will want to use the default app-level cache scoping, because the performance benefits are shared across multiple users: the more people who use the app, the greater the performance benefit is. The main reason you would want to scope the cache to a session is for privacy: it can reveal information if a user visits a plot and finds that it renders very quickly – specifically, that someone else has seen that same plot before. However, if this is a potential problem, then it might be more appropriate to use a regular non-cached plot.

To control the scope of the cache, use cache="app" or cache="session":

renderCachedPlot(..., cache = "app")

# Or
renderCachedPlot(..., cache = "session")

If either “app” or “session” is used, the cache will be 10 MB in size, and will be stored in memory, using an object created by memoryCache(). The default size of 10 MB can hold plenty of plots – a cached plot object is typically between 50 and 250 kB.

Finer control over caching

The "app" and "session" options for cache scoping offer a simple way to control cache scoping, but in some cases you may want to have finer control over the caching.

In some cases, you may want more control over the caching behavior. For example, you may want to use a larger or smaller cache, share a cache among multiple R processes, or you may want the cache to persist across multiple runs of an application, or even across multiple R processes. The backing store for the cache can be in-memory, on disk, or you can even use a database like Redis, SQLite, or mySQL.

To use different settings for an application-scoped cache, you can call shinyOptions() at the top of your app.R, server.R, or global.R. For example, this will create a cache with 20 MB of space instead of the default 10 MB. (Note that if you copy and paste this in your R console instead of running it in app.R/server.R, it will set the default cache for Shiny apps until your R session exits, due to how scoping for shinyOptions works.)

# At the top of app.R, server.R, or global.R
shinyOptions(cache = memoryCache(size = 20e6))

To use different settings for a session-scoped cache, you can call shinyOptions() at the top of your server function. To use the session-scoped cache, you must also call renderCachedPlot() with cache = "session". This will create a 20 MB cache for the session:

function(input, output, session) {
  shinyOptions(cache = memoryCache(size = 20e6))
  
  output$plot <- renderCachedPlot(
    ...,
    cache = "session"
  )
}

If you want to create a memory cache that is used by one particular plot, you can pass it directly to the renderCachedPlot() call:

mcache <- memoryCache(size = 20e6)

function(input, output, session) {
  
  output$plot <- renderCachedPlot(
    ...,
    cache = mcache
  )
}

In the example above, the mcache is shared among all sessions, but just for this one plot. Any other cached plots would use the default app-level cache.

If you want to create a cache that is shared across multiple concurrent R processes, you can use a diskCache. You can create a shared disk cache by putting this at the top of your app.R, server.R, or global.R:

# At the top of app.R, server.R, or global.R
shinyOptions(cache = diskCache(file.path(dirname(tempdir()), "myapp-cache")))

This will create a subdirectory in your system temp directory named myapp-cache (replace myapp-cache with a unique name of your choosing). This directory will be a shared cache among all the R processes that choose to use it. On most platforms, this directory will be removed when your system reboots. This cache will persist across multiple starts and stops of the R process, as long as you do not reboot.

To have the cache persist even across multiple reboots, you can create the cache in a location outside of the temp directory. For example, it could be a subdirectory of the application:

# At the top of app.R, server.R, or global.R
shinyOptions(cache = diskCache("./myapp-cache"))

In this case, resetting the cache will have to be done manually, by deleting the directory.

You can also scope a cache to just one plot, or selected plots. To do that, create a memoryCache or diskCache, and pass it as the cache argument of renderCachedPlot().
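
For example, here is a sketch of a disk cache used by just one plot (the directory name is arbitrary); any other cached plots in the app would continue to use the default app-level cache:

# A disk cache shared by all sessions, but used only by this one plot
plot_cache <- diskCache(file.path(dirname(tempdir()), "myapp-plot-cache"))

function(input, output, session) {

  output$plot <- renderCachedPlot(
    ...,
    cache = plot_cache
  )
}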

Cache scoping in RStudio Connect

RStudio Connect includes support for running multiple R processes for an application. For apps deployed to RStudio Connect, the best practice is to use a disk cache located in the application directory. To do so, add a line like this to your application:

# At the top of app.R, server.R, or global.R
shinyOptions(cache = diskCache("./myapp-cache"))

RStudio Connect automatically deletes the cache if the application is redeployed, ensuring new versions of the application do not inherit a stale cache.

FAQ

How long does it take to hash my cacheKeyExpr?

If you are concerned about the speed of hashing your cache key, you can test it out by using system.time(), along with digest(). The digest() used in renderCachedPlot uses the xxHash64 algorithm. For example, here is the time it takes to hash a data frame with 2 columns and 400 rows:

d <- data.frame(x = rnorm(400), y = rnorm(400))
system.time(digest::digest(d, "xxhash64"))
#>   user  system elapsed 
#>  0.000   0.000   0.001 

The total time is about 1 millisecond.

Another example: the diamonds data set from ggplot2 contains 10 columns and about 54,000 rows:

system.time(digest::digest(ggplot2::diamonds, "xxhash64"))
#>   user  system elapsed 
#>  0.013   0.002   0.014

This takes about 14 milliseconds, which is generally much faster than creating a plot of the same data.

Why is my diskCache giving me a warning about reference objects?

If you use a diskCache, you may in some cases see a warning like this:

Warning message:
In d$set(key, value) :
  A reference object was cached in a serialized format. The restored object may not work as expected.

When a diskCache stores an R object, it serializes the object and then saves the serialized data to disk. However, reference objects such as environments and external pointers cannot be guaranteed to restore exactly the same as when they are stored, especially if they are restored in another R session. (A memoryCache does not serialize R objects, and so does not give this warning.)

The warning will be shown when caching an environment, as below:

dc <- diskCache()
e <- new.env()
dc$set("x", e)
#> Warning message:
#> In dc$set("x", e) :
#>   A reference object was cached in a serialized format. The restored object may not work as expected.

But other objects can also contain environments. Functions, for instance, consist of formals (the function parameters), a body, and an environment, and so serializing a function will result in the same warning.

f <- function() 1+1
dc$set("x", f)
#> Warning message:
#> In dc$set("x", f) :
#>   A reference object was cached in a serialized format. The restored object may not work as expected.

A reactive expression from Shiny is a special function, and it will result in the warning as well:

r <- reactive(1)
dc$set("x", r)
#> Warning message:
#> In dc$set("x", r) :
#>   A reference object was cached in a serialized format. The restored object may not work as expected.

In the context of a renderCachedPlot(), this warning can help you detect accidental usage of a reactive expression in the cache key, when you actually intended to use the result of a reactive expression in the cache key.

# r is some reactive expression
r <- reactive(...)


# Bad: Using reactive expression in cacheKeyExpr. This will raise a warning.
renderCachedPlot(...,
  cacheKeyExpr = { list(r) }
)


# Good: Using the value of reactive expression in cacheKeyExpr.
renderCachedPlot(...,
  cacheKeyExpr = { list(r()) }
)

Can I customize the behavior of memoryCache and diskCache?

The memoryCache and diskCache provide many options to customize their behavior. The default behavior for both is to set a maximum size for the cache and to use an LRU (least-recently-used) eviction policy. However, it is also possible to expire objects after a certain amount of time, by setting max_age.

For example, if you want to use an app-level memory cache with unlimited size, but where the objects expire after five minutes, you would do the following at the top of your app.R/server.R:

shinyOptions(cache = memoryCache(max_size = Inf, max_age = 300))
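
A diskCache accepts the same kinds of arguments. For example, this sketch creates a 50 MB app-level disk cache whose objects expire after one hour (the directory name is arbitrary):

# At the top of app.R, server.R, or global.R
shinyOptions(cache = diskCache(
  file.path(dirname(tempdir()), "myapp-cache"),
  max_size = 50e6,  # Prune the cache when it exceeds 50 MB
  max_age = 3600    # Expire objects after one hour (in seconds)
))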

Can I write my own caching backend?

It is possible to use a caching backend other than Shiny’s built-in memoryCache and diskCache. To do this, you simply need to supply a caching object – that is, an object that has $get() and $set() methods on it. In this section, we will see how to create a cache object that can be used by Shiny.

Here is a function that creates an extremely simple caching object:

createSimpleCache <- function() {
  # Store cached values in an environment, indexed by the cache key
  e <- new.env(parent = emptyenv())
  
  list(
    get = function(key) {
      if (exists(key, envir = e, inherits = FALSE)) {
        return(e[[key]])
      } else {
        # key_missing() is provided by the shiny package
        return(key_missing())
      }
    },
    set = function(key, value) {
      e[[key]] <- value
    }
  )
}

To use it:

sc <- createSimpleCache()

sc$set("abc", 123)
sc$get("abc")
#> [1] 123

# Calling $get() on a missing key returns a key_missing() object
sc$get("xyz")
#> <Key Missing>

The object returned by createSimpleCache() does all the things that a Shiny application needs from a cache. The requirements are very simple:

  • There is a $set(key, value) method which takes a string key, and any R object as a value.
  • There is a $get(key) method which returns a cached value, and if the value is not present, returns a key_missing() object.

Our example is a simple memory cache – unlike Shiny’s memoryCache(), it doesn’t do any pruning, and so it will just keep on growing arbitrarily large. But it does the job for this example.

If you put the following at the top of app.R, server.R, or global.R, it will use one of these simple cache objects as the default application-level cache.

shinyOptions(cache = createSimpleCache())

Similarly, you can put it in the server function to serve as the session-level cache:

function(input, output, session) {
  shinyOptions(cache = createSimpleCache())

  # ... more code here
}

Finally, you can also create the cache object and pass it directly to a renderCachedPlot(). As described in the previous section, the cache would be shared among all Shiny sessions connected to this R process, but other plots in the app would use the default app-level cache.

simple_cache <- createSimpleCache()

function(input, output, session) {
  
  output$plot <- renderCachedPlot(
    ...,
    cache = simple_cache
  )
}

Can I use Redis for the cache?

In the previous section, we saw how to create a simple cache. Here we’ll create a cache that uses a local Redis store as the backend. In this case, we’ll use an R6 class for the caching objects, instead of a list.

library(shiny)
library(redux)
library(R6)

RedisCache <- R6Class("RedisCache",
  public = list(
    initialize = function(..., namespace = NULL) {
      private$r <- redux::hiredis(...)
      # Configure redis as a cache with a 20 MB capacity
      private$r$CONFIG_SET("maxmemory", "20mb")
      private$r$CONFIG_SET("maxmemory-policy", "allkeys-lru")
      private$namespace <- namespace
    },
    get = function(key) {
      key <- paste0(private$namespace, "-", key)
      s_value <- private$r$GET(key)
      if (is.null(s_value)) {
        return(key_missing())
      }
      unserialize(s_value)
    },
    set = function(key, value) {
      key <- paste0(private$namespace, "-", key)
      s_value <- serialize(value, NULL)
      private$r$SET(key, s_value)
    }
  ),
  private = list(
    r = NULL,
    namespace = NULL
  )
)

Redis provides a key-value store, and our RedisCache objects can have a namespace – which is simply implemented as a prefix to the keys. So if the namespace is "myapp" and you store an object with the key "abc", then it will be stored in Redis with the namespaced key "myapp-abc". This allows you to share the same Redis store among multiple different Shiny applications, each with its own namespace.

Before we create a cache, we need to start up a local Redis server. On a Mac, you can do this by running the following from the command line:

# Install redis via Homebrew
brew install redis

# Start the Redis server with the default configuration
redis-server /usr/local/etc/redis.conf

Now that Redis is running, we can create a RedisCache object and test it out:

rc <- RedisCache$new(namespace = "test")

rc$set("abc", 123)
rc$get("abc")
#> [1] 123

# Getting a key that's not present
rc$get("xyz")
#> <Key Missing>

This example is the same as the earlier ones, except that it uses a Redis cache as the default application-level cache. (Note once again that if you run shinyOptions(cache=...) from the console instead of in an app.R/server.R, it will set the default cache for the rest of the R session instead of for the duration of the application.)

## app.R ##
shinyOptions(cache = RedisCache$new(namespace = "myapp"))

shinyApp(
  fluidPage(
    sidebarLayout(
      sidebarPanel(
        sliderInput("n", "Number of points", 4, 32, value = 8, step = 4)
      ),
      mainPanel(plotOutput("plot"))
    )
  ),
  function(input, output, session) {
    output$plot <- renderCachedPlot({
        Sys.sleep(2)  # Add an artificial delay
        seqn <- seq_len(input$n)
        plot(mtcars$wt[seqn], mtcars$mpg[seqn],
             xlim = range(mtcars$wt), ylim = range(mtcars$mpg))
      },
      cacheKeyExpr = { list(input$n) }
      # Another alternative: set the cache just for this plot.
      #, cache = RedisCache$new(namespace = "myapp-plot")
    )
  }
)

The Redis cache can be shared among multiple R processes running the same app, as long as they point to the same Redis cache and use the same namespace (in this case, "myapp").

An alternative is to use the Redis cache just for a specific plot. In the code above, the renderCachedPlot() call has a commented-out cache argument. If you uncomment the cache argument (instead of calling shinyOptions() above), then that plot will use the Redis cache, with the namespace "myapp-plot". Multiple instances of this application, or multiple different applications, could share the Redis store using the same namespace.

In the RedisCache class defined above, the Redis store was configured to behave as a cache with a 20 MB capacity and a LRU (least-recently-used) expiration policy, by the following lines:

  private$r$CONFIG_SET("maxmemory", "20mb")
  private$r$CONFIG_SET("maxmemory-policy", "allkeys-lru")

If the Redis store is shared among applications, it may not be a good idea to configure Redis from R, because one application may set certain settings, and then another application may change them. In these cases, it is advisable to configure the Redis store using the Redis configuration file. In production, you should also use a more secure configuration, including a password for Redis.
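
For example, connection settings can be passed through RedisCache$new() to redux::hiredis() when the cache object is created. This is only a sketch; it assumes your Redis server is configured with requirepass, and the host name and environment variable shown here are placeholders:

# Hypothetical: connect to a password-protected Redis server. The arguments
# are forwarded by RedisCache$new() to redux::hiredis().
cache <- RedisCache$new(
  host = "redis.example.com",
  port = 6379,
  password = Sys.getenv("REDIS_PASSWORD"),
  namespace = "myapp"
)
shinyOptions(cache = cache)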

Can I use storr for the caching backend?

The storr package provides a consistent key-value interface for several different kinds of backends, including SQLite, Postgres, R environments, RDS files, and more.

It’s possible to wrap a storr object so that it behaves in a way that is compatible with what Shiny’s caching objects expect. The main difference is that, when Shiny’s memoryCache or diskCache try to retrieve a key that is not present, they return a key_missing object, while storr throws a KeyError.

In the code below, the keyMissingCacheDecorator takes a storr object, and wraps it so that when the get() method encounters a missing key, it returns a key_missing object instead of throwing a KeyError.

keyMissingCacheDecorator <- function(cacheObj, missing = key_missing()) {
  force(missing)

  list(
    get = function(key) {
      is_missing <- FALSE
      tryCatch(
        { val <- cacheObj$get(key) },
        KeyError = function(e) {
          is_missing <<- TRUE
        }
      )
      
      if (is_missing) {
        return(missing)
      }
      
      val
    },
    
    set = cacheObj$set
  )
}

To test it out, we can use a storr cache that is backed by an R environment – this is similar to a memoryCache(), but without pruning.

library(storr)

# Create the storr cache
se <- storr_environment()

# Wrap it in the decorator
se_cache <- keyMissingCacheDecorator(se)

# Test it out
se_cache$set("x", 123)
se_cache$get("x")
#> [1] 123

se_cache$get("abc")
#> <Key Missing>

The keyMissingCacheDecorator can be used with any storr cache, not just the environment-backed cache used in the example above.
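
For example, here is a sketch that wraps an RDS-file-backed storr cache (storr_rds() stores each value as an .rds file under the given directory; the directory name here is arbitrary) and uses it as the app-level cache:

library(storr)

# Wrap an RDS-backed storr cache with the key_missing decorator
rds_cache <- keyMissingCacheDecorator(storr_rds("./myapp-storr-cache"))

# At the top of app.R, server.R, or global.R
shinyOptions(cache = rds_cache)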



If you have questions about this article or would like to discuss ideas presented here, please post on RStudio Community. Our developers monitor these forums and answer questions periodically. See help for more help with all things Shiny.

