1.2 Extending ggplotly()

1.2.1 Customizing the layout

Since the ggplotly() function returns a plotly object, we can manipulate that object in the same way that we would manipulate any other plotly object. A simple and useful application of this is to specify interaction modes, like plotly.js’ layout.dragmode for specifying the mode of click+drag events. Figure 1.6 demonstrates how the default for this attribute can be modified via the layout() function.

p <- ggplot(fortify(gold), aes(x, y)) + geom_line()
gg <- ggplotly(p)
layout(gg, dragmode = "pan")

Figure 1.6: Customizing the dragmode of an interactive ggplot2 graph.

Perhaps a more useful application is to add a range slider to the x-axis, which lets you zoom in on part of the x-axis without losing the global context. This is quite useful for quickly altering the limits of your plot to achieve an optimal aspect ratio for your data (Cleveland, McGill, and McGill 1988). Figure 1.7 uses the rangeslider() function to add a range slider to the plot.
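The call itself is short. The following is a minimal sketch rather than the exact code behind the figure: to keep it runnable without extra packages, it assumes ggplot2's built-in economics data in place of the gold series, and the names p_ts, gg_ts, and gg_rs are illustrative only.

```r
library(plotly)
library(ggplot2)

# Assumption: ggplot2's `economics` data stands in for the `gold` series
# used above, so the sketch runs without additional packages.
p_ts <- ggplot(economics, aes(date, unemploy)) + geom_line()
gg_ts <- ggplotly(p_ts)

# rangeslider() returns a plotly object with a slider beneath the x-axis
gg_rs <- rangeslider(gg_ts)
gg_rs
```

Because rangeslider() returns a plotly object, its result can be passed on to layout() or style() like any other plotly object.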


Figure 1.7: Adding a rangeslider to an interactive ggplot2 graph.

Since a single plotly object can only have one layout, modifying the layout of a ggplotly() result is fairly easy; adding and modifying layers is trickier.

1.2.2 Modifying layers

As mentioned previously, ggplotly() translates each ggplot2 layer into one or more plotly.js traces. In this translation, it is forced to make a number of assumptions about trace attribute values that may or may not be appropriate for the use case. The style() function is useful in this scenario, as it provides a way to modify trace attribute values in a plotly object. Before using it, you may want to inspect the actual traces in a given plotly object using the plotly_json() function. This function uses the listviewer package to display a convenient interactive view of the JSON object sent to plotly.js (de Jong and Russell 2016). By clicking on the arrow next to the data element, you can see the traces (data) behind the plot. In this case, we have three traces: one for the geom_point() layer and two for the geom_smooth() layer.
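As a rough sketch of that inspection step, the code below builds a scatterplot with a smoother and then looks at its traces. The plotly_json() call opens the interactive viewer described above when listviewer is installed; the plotly_build() lines are an alternative way to reach the same trace list from the console, shown here as an assumption about what is convenient rather than as the required workflow.

```r
library(plotly)
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() + geom_smooth()
gg <- ggplotly(p)

# Interactive view of the JSON sent to plotly.js (uses listviewer when
# run interactively; otherwise the JSON is printed)
plotly_json(gg)

# Console alternative: the traces live in the `data` element of the
# built object
traces <- plotly_build(gg)$x$data
length(traces)
sapply(traces, function(tr) tr$mode)
```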


Figure 1.8: Using listviewer to inspect the JSON representation of a plotly object.

Say, for example, we’d like to display information when hovering over points, but not when hovering over the fitted values or error bounds. The ggplot2 API has no semantics for making this distinction, but this is easily done in plotly.js by setting the hoverinfo attribute to "none". Since the fitted values and error bounds are contained in the second and third traces, we can hide the information on just these traces using the traces argument of the style() function. Generally speaking, the style() function is designed to modify attribute values of trace(s) within a plotly object, which is primarily useful for customizing defaults produced via ggplotly().

style(p, hoverinfo = "none", traces = 2:3)

Figure 1.9: Using the style() function to modify hoverinfo attribute values of a plotly object created via ggplotly() (by default, ggplotly() displays hoverinfo for all traces). In this case, the hoverinfo for the fitted line and error bounds is hidden.

1.2.3 Leveraging statistical output

Since ggplotly() returns a plotly object, and plotly objects can have data attached to them, ggplotly() attaches data from the ggplot2 layer(s), either before or after summary statistics have been applied. Because each ggplot2 layer owns a data frame, it is useful to have some way to specify which layer’s data is of interest, which is done via the layerData argument in ggplotly(). Also, when a particular layer applies a summary statistic (e.g., geom_histogram()) or a statistical model (e.g., geom_smooth()) to the data, it might be useful to access the output of that transformation, which is the point of the originalData argument in ggplotly(): setting it to FALSE attaches the transformed data rather than the raw data.

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
   geom_point() + geom_smooth()
p %>%
  ggplotly(layerData = 2, originalData = FALSE) %>%
  plotly_data()
#> # A tibble: 80 × 13
#>       x     y  ymin  ymax    se  PANEL group  colour   fill  size linetype
#> * <dbl> <dbl> <dbl> <dbl> <dbl> <fctr> <int>   <chr>  <chr> <dbl>    <dbl>
#> 1  1.51  32.1  28.1  36.0  1.92      1    -1 #3366FF grey60     1        1
#> 2  1.56  31.7  28.2  35.2  1.72      1    -1 #3366FF grey60     1        1
#> 3  1.61  31.3  28.1  34.5  1.54      1    -1 #3366FF grey60     1        1
#> 4  1.66  30.9  28.0  33.7  1.39      1    -1 #3366FF grey60     1        1
#> 5  1.71  30.5  27.9  33.0  1.26      1    -1 #3366FF grey60     1        1
#> 6  1.76  30.0  27.7  32.4  1.16      1    -1 #3366FF grey60     1        1
#> # ... with 74 more rows, and 2 more variables: weight <dbl>, alpha <dbl>
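To make the effect of originalData more concrete, the sketch below pulls the data attached to the geom_smooth() layer both ways and compares them. The names raw and built are illustrative, and the exact set of columns may differ between ggplot2 versions.

```r
library(plotly)
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() + geom_smooth()

# Data supplied to the geom_smooth() layer, before the smoother is applied
raw <- ggplotly(p, layerData = 2, originalData = TRUE) %>% plotly_data()

# Data after the smoother has been applied (fitted values, bounds, se)
built <- ggplotly(p, layerData = 2, originalData = FALSE) %>% plotly_data()

dim(raw)
dim(built)

# Columns that only exist after the statistic has been computed
setdiff(names(built), names(raw))
```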

The data shown above is the data ggplot2 uses to actually draw the fitted values (as a line) and standard error bounds (as a ribbon). Figure 1.10 leverages this data to add additional information about the model fit; in particular, it adds vertical lines and annotations at the x-values that are associated with the highest and lowest amount of uncertainty in the fitted values. Producing a plot like this with ggplot2 would be impossible using geom_smooth() alone (see note 1 at the end of this section). Providing a simple visual clue like this can help combat visual misperceptions of uncertainty bands due to the sine illusion (VanderPlas and Hofmann 2015).

p %>%
  ggplotly(layerData = 2, originalData = FALSE) %>%
  add_fun(function(p) {
    # mark the x-value where the standard error is largest
    p %>% slice(which.max(se)) %>%
      add_segments(x = ~x, xend = ~x, y = ~ymin, yend = ~ymax) %>%
      add_annotations("Maximum uncertainty", ax = 60)
  }) %>%
  add_fun(function(p) {
    # mark the x-value where the standard error is smallest
    p %>% slice(which.min(se)) %>%
      add_segments(x = ~x, xend = ~x, y = ~ymin, yend = ~ymax) %>%
      add_annotations("Minimum uncertainty")
  })

Figure 1.10: Leveraging data associated with a geom_smooth() layer to display additional information about the model fit.

In addition to leveraging output from StatSmooth, it is sometimes useful to leverage output of other statistics, especially for annotation purposes. Figure 1.11 leverages the output of StatBin to add annotations to a stacked bar chart. Annotation is primarily helpful for displaying the heights of bars in a stacked bar chart, since decoding the heights of bars is a fairly difficult perceptual task (Cleveland and McGill 1984). As a result, it is much easier to compare bar heights representing the proportion of diamonds with a given clarity across various diamond cuts.

p <- ggplot(diamonds, aes(cut, fill = clarity)) +
  geom_bar(position = "fill")

ggplotly(p, originalData = FALSE) %>%
  mutate(ydiff = ymax - ymin) %>%
  add_text(
    x = ~x, y = ~1 - (ymin + ymax) / 2,
    text = ~ifelse(ydiff > 0.02, round(ydiff, 2), ""),
    showlegend = FALSE, hoverinfo = "none",
    color = I("black"), size = I(9)
  )

Figure 1.11: Leveraging output from StatBin to add annotations to a stacked bar chart (created via geom_bar()) which makes it easier to compare bar heights.

Another useful application is labelling the levels of each piece/polygon output by StatDensity2d as shown in Figure 1.12. Note that, in this example, the add_text() layer takes advantage of ggplotly()’s ability to inherit aesthetics from the global mapping. Furthermore, since originalData is FALSE, it attaches the “built” aesthetics (i.e., the x/y positions after StatDensity2d has been applied to the raw data).

p <- ggplot(MASS::geyser, aes(x = waiting, y = duration)) +
  geom_density_2d()

ggplotly(p, originalData = FALSE) %>%
  group_by(piece) %>%
  slice(which.min(y)) %>%
  add_text(
    text = ~level, size = I(9), color = I("black"), hoverinfo = "none"
  )

Figure 1.12: Leveraging output from StatDensity2d to add annotations to contour levels.


Cleveland, William S., Robert McGill, and Marylyn E. McGill. 1988. “The Shape Parameter of a Two-Variable Graph.” Journal of the American Statistical Association 83 (402): 289–300. http://www.jstor.org/stable/2288843.

de Jong, Jos, and Kenton Russell. 2016. Listviewer: ’Htmlwidget’ for Interactive Views of R Lists. https://github.com/timelyportfolio/listviewer.

VanderPlas, Susan, and Heike Hofmann. 2015. “Signs of the Sine Illusion—Why We Need to Care.” Journal of Computational and Graphical Statistics 24 (4): 1170–90. doi:10.1080/10618600.2014.951547.

Cleveland, William S, and Robert McGill. 1984. “Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods.” Journal of the American Statistical Association 79 (September): 531–54.

  1. It could be recreated by fitting the model via loess(), obtaining the fitted values and standard error with predict(), and feeding those results into geom_line()/geom_ribbon()/geom_text()/geom_segment(), but that process is much more onerous.
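
For comparison, the manual route described in this note might look roughly like the sketch below. It assumes loess() with its default settings and approximate 95% bounds computed from the standard errors, so it may not reproduce the geom_smooth() ribbon exactly; grid, worst, and p_manual are illustrative names.

```r
library(ggplot2)

# Fit the smoother directly (default loess() settings are an assumption)
fit <- loess(mpg ~ wt, data = mtcars)

# Evaluate the fit on an evenly spaced grid spanning the observed weights
grid <- data.frame(wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 80))
pred <- predict(fit, newdata = grid, se = TRUE)

# Approximate 95% bounds from the standard errors
ci <- qt(0.975, pred$df) * pred$se.fit
grid$fit  <- pred$fit
grid$se   <- pred$se.fit
grid$ymin <- pred$fit - ci
grid$ymax <- pred$fit + ci

# Grid point with the largest standard error
worst <- grid[which.max(grid$se), ]

p_manual <- ggplot(grid, aes(x = wt)) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax), alpha = 0.3) +
  geom_line(aes(y = fit)) +
  geom_point(data = mtcars, aes(y = mpg)) +
  geom_segment(data = worst, aes(xend = wt, y = ymin, yend = ymax)) +
  geom_text(data = worst, aes(y = ymax),
            label = "Maximum uncertainty", vjust = -0.5)
p_manual
```

Every quantity that ggplotly() attaches automatically (fitted values, bounds, standard errors) has to be computed and carried by hand here, which is the extra effort the note refers to.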