```{r}
#| label: setup
#| results: hold
base::source(file = "R/helper.R")
ggplot2::theme_set(ggplot2::theme_bw())
utils::data(economics, package = "ggplot2")
```
This chapter works with the economics dataset from {ggplot2}. I have loaded it with utils::data(economics, package = "ggplot2") in the setup code chunk so that it is always available, even if I don’t run the whole file but only a specific code chunk.
R Code 3.1 : Load and inspect dataset
dplyr::glimpse(economics)
#> Rows: 574
#> Columns: 6
#> $ date <date> 1967-07-01, 1967-08-01, 1967-09-01, 1967-10-01, 1967-11-01, …
#> $ pce <dbl> 506.7, 509.8, 515.6, 512.2, 517.4, 525.1, 530.9, 533.6, 544.3…
#> $ pop <dbl> 198712, 198911, 199113, 199311, 199498, 199657, 199808, 19992…
#> $ psavert <dbl> 12.6, 12.6, 11.9, 12.9, 12.8, 11.8, 11.7, 12.3, 11.7, 12.3, 1…
#> $ uempmed <dbl> 4.5, 4.7, 4.6, 4.9, 4.7, 4.8, 5.1, 4.5, 4.1, 4.6, 4.4, 4.4, 4…
#> $ unemploy <dbl> 2944, 2945, 2958, 3143, 3066, 3018, 2878, 3001, 2877, 2709, 2…
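A data frame with 574 rows and 6 variables:
- date: month of data collection
- pce: personal consumption expenditures, in billions of dollars
- pop: total population, in thousands
- psavert: personal savings rate
- uempmed: median duration of unemployment, in weeks
- unemploy: number of unemployed, in thousands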
A plotly.js figure contains one (or more) trace(s), and every trace has a type. The trace type scatter is great for drawing low-level geometries (e.g., points, lines, text, and polygons) and provides the foundation for many plotly::add_*() functions (e.g., plotly::add_markers(), plotly::add_lines(), plotly::add_paths(), plotly::add_segments(), plotly::add_ribbons(), plotly::add_area(), and plotly::add_polygons()) as well as many plotly::ggplotly() charts.
It is very instructive to display all these different low-level geometries with the examples mentioned in the R help file for plotly::add_trace().
Code Collection 3.1 : Examples for adding trace(s) to a {plotly} visualization
R Code 3.2 : Add markers (points) as scatter trace
plotly::plot_ly(economics, x = ~date, y = ~uempmed) |>
  plotly::add_markers()
Some plotly::add_*() functions are a specific case of a trace type. For example, plotly::add_markers() is a scatter trace with mode of markers.
The above code could also be written as: plotly::plot_ly(economics, x = ~date, y = ~uempmed, type = "scatter", mode = "markers")
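The equivalence of the two spellings can be checked by building both figures and inspecting the resulting trace. This is only a sketch: it assumes {plotly} and {ggplot2} are installed, and the object names (p_markers, p_explicit, trace_markers, trace_explicit) are mine.

```r
utils::data(economics, package = "ggplot2")

# Variant 1: let add_markers() choose the trace type and mode
p_markers <- plotly::plot_ly(economics, x = ~date, y = ~uempmed) |>
  plotly::add_markers()

# Variant 2: state the trace type and mode explicitly
p_explicit <- plotly::plot_ly(
  economics,
  x = ~date, y = ~uempmed,
  type = "scatter", mode = "markers"
)

# plotly_build() evaluates the figure so the trace attributes can be inspected
trace_markers  <- plotly::plotly_build(p_markers)$x$data[[1]]
trace_explicit <- plotly::plotly_build(p_explicit)$x$data[[1]]

base::c(trace_markers$type, trace_markers$mode)
base::c(trace_explicit$type, trace_explicit$mode)
```

Both calls should report the same type and mode, which is why I treat the two spellings as interchangeable.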
R Code 3.4 : Personal savings visualized with plotly::add_paths()
plotly::add_paths(): The figure connects observations according to the ordering of psavert (personal savings rate).
R Code 3.5 : Personal saving rates visualized with plotly::add_lines()
plotly::add_lines(): The figure connects observations according to the ordering of x (the date).
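The difference between paths and lines only shows up when the row order differs from the order of x. A minimal sketch, assuming the data are first sorted by psavert with dplyr::arrange() (the object names p, p_paths and p_lines are my own):

```r
utils::data(economics, package = "ggplot2")

# Sort by psavert so that the row order differs from the date order
p <- economics |>
  dplyr::arrange(psavert) |>
  plotly::plot_ly(x = ~date, y = ~psavert)

# add_paths() connects the points in row order (here: ordered by psavert)
p_paths <- plotly::add_paths(p)

# add_lines() connects the points in the order of x (here: the date)
p_lines <- plotly::add_lines(p)

# Show both variants stacked, sharing the date axis
plotly::subplot(p_paths, p_lines, nrows = 2, shareX = TRUE)
```

The top panel looks like a scribble because consecutive rows jump back and forth in time, whereas the bottom panel shows the familiar time series.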
R Code 3.6 : Working with plotly.js directly
If you prefer to work with plotly.js more directly, you can always use plotly::add_trace() and specify the type and mode yourself. See also code and comment for Figure 3.1.
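As a small illustration of specifying type and mode yourself, the following sketch draws the unemployment series with markers and lines in one trace. The combination "markers+lines" and the name p_trace are my choices for the example.

```r
utils::data(economics, package = "ggplot2")

# add_trace() takes the plotly.js trace type and mode directly
p_trace <- plotly::plot_ly(economics, x = ~date, y = ~uempmed) |>
  plotly::add_trace(type = "scatter", mode = "markers+lines")

p_trace
```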
In addition to ‘aesthetic mapping’ arguments (unique to the R package) which make it easier to map data to visual properties, {dplyr} groupings can be used to ensure there is at least one geometry per group.
Code Collection 3.2 : Generating one geometry per ‘group’
R Code 3.7 : Group data with {dplyr}
{dplyr} groupings can be used to ensure there is at least one geometry per group. Figure 3.5 demonstrates how dplyr::group_by() could be used to effectively wrap the time series from Figure 3.1 by year, which can be useful for visualizing annual seasonality.
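A sketch of the grouping approach, assuming year and month are derived from date with base R (the column names yr and mnth and the object names econ and p_grouped are illustrative):

```r
utils::data(economics, package = "ggplot2")

# Derive year and month columns from the date
econ <- economics |>
  dplyr::mutate(
    yr   = base::as.integer(base::format(date, "%Y")),
    mnth = base::as.integer(base::format(date, "%m"))
  )

# One line per year, all drawn in a single trace
p_grouped <- econ |>
  dplyr::group_by(yr) |>
  plotly::plot_ly(x = ~mnth, y = ~uempmed) |>
  plotly::add_lines(text = ~yr)

p_grouped

# Number of traces in the built figure
base::length(plotly::plotly_build(p_grouped)$x$data)
```

Because the grouping only separates the lines inside one trace, the figure stays fast but has no legend entries to filter on.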
R Code 3.8 : Multiple traces with levels of a categorical variable
Another approach to generating at least one geometry per ‘group’ is to provide a categorical variable to a relevant aesthetic (e.g., color).
Watch a video on how to use the interactive features in Figure 3.6. For a bigger interactive demonstration to play around with, see https://plotly-r.com/interactives/scatter-lines.html.
Comparatively speaking, Figure 3.6 has more interactive capabilities (e.g., legend-based filtering and multiple tooltips) than Figure 3.5, but it does not scale as well with many lines.
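The categorical-aesthetic approach could look like the sketch below. It assumes the year is turned into an ordered factor so that each year gets its own trace and legend entry; yr, mnth, econ and p_color are illustrative names.

```r
utils::data(economics, package = "ggplot2")

econ <- economics |>
  dplyr::mutate(
    yr   = base::as.integer(base::format(date, "%Y")),
    mnth = base::as.integer(base::format(date, "%m"))
  )

# Mapping an ordered factor to color yields multiple traces
p_color <- plotly::plot_ly(econ, x = ~mnth, y = ~uempmed) |>
  plotly::add_lines(color = ~base::ordered(yr))

p_color

# Number of traces in the built figure
base::length(plotly::plotly_build(p_color)$x$data)
```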
R Code 3.9 : Guaranteeing one trace per group level
The split argument guarantees one trace per group level (regardless of the variable type). This is useful if you want a consistent visual property over multiple traces.
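A sketch of the split argument, again with illustrative names (yr, mnth, econ, p_split). The fixed color base::I("black") is my choice to show one consistent visual property across all traces.

```r
utils::data(economics, package = "ggplot2")

econ <- economics |>
  dplyr::mutate(
    yr   = base::as.integer(base::format(date, "%Y")),
    mnth = base::as.integer(base::format(date, "%m"))
  )

# split creates one trace per year, even though yr is numeric
p_split <- plotly::plot_ly(econ, x = ~mnth, y = ~uempmed) |>
  plotly::add_lines(split = ~yr, color = base::I("black"))

p_split

# Compare the number of traces with the number of distinct years
n_traces <- base::length(plotly::plotly_build(p_split)$x$data)
n_years  <- base::length(base::unique(econ$yr))
base::c(traces = n_traces, years = n_years)
```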
In the case of Figure 3.5, the benefit of having multiple traces is that we can perform interactive filtering via the legend and compare multiple y-values at a given x. The cost of having those capabilities is that plots start to become sluggish after a few hundred traces, whereas thousands of lines can be rendered fairly easily in one trace.
Mapping data to visual properties makes it easier to get started using plotly.js, but it still pays off to learn how to use plotly.js directly. You won’t find plotly.js attributes listed as explicit arguments in any plotly function (except for the special type attribute), but they are passed along verbatim to the plotly.js figure definition through the ... operator.
The scatter-based layers in this chapter fix the type plotly.js attribute to “scatter” as well as the mode (e.g., plotly::add_markers() uses mode = 'markers', etc.), but you could also use the lower-level plotly::add_trace() to work more directly with plotly.js.
For example, Figure 3.3 shows how to render markers, lines, and text in the same scatter trace. It also demonstrates how to leverage nested plotly.js attributes, like textfont and xaxis. These attributes contain other attributes, so you need to supply a suitable named list to these arguments.
R Code 3.10 : Render different modes (markers, lines and text) in the same scatter trace
base::set.seed(99)
plotly::plot_ly() |>
  plotly::add_trace(
    type = "scatter",
    mode = "markers+lines+text",
    x = 4:6,
    y = 4:6,
    text = base::replicate(
      3,
      praise::praise("You are ${adjective}! 🙌")
    ),
    textposition = "right",
    hoverinfo = "text",
    textfont = base::list(family = "Roboto Condensed", size = 16)
  ) |>
  plotly::layout(xaxis = base::list(range = c(3, 8)))
Using the plotly::add_trace() function to render markers, lines, and text in a single scatter trace. The plotly::add_trace() function, as well as any plotly::add_*() function, allows you to directly specify plotly.js attributes.
Resource 3.1 : plotly.js attributes
plotly::schema(): It provides more information than the online docs (e.g., value types, default values, acceptable ranges, etc.). It matches the version used in the R package and the interface makes it easier to traverse and discover new attributes. See R Code 2.14.
For the following examples it is helpful for me to compare plotly and ggplot2 with their easiest syntax for a graph.
R Code 3.11 : Compare the simplest graphics created by plotly and ggplot2
p1 <- plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy) |>
  plotly::add_markers()
p2 <- ggplot2::ggplot(ggplot2::mpg, ggplot2::aes(x = cty, y = hwy)) +
  ggplot2::geom_point()
plotly::subplot(p1, p2)
The syntax of plotly and ggplot2 is similar:
- plotly maps variables with a formula (e.g., x = ~cty), whereas ggplot2 uses ggplot2::aes() without formula.
- plotly chains layers with the pipe (|>), whereas ggplot2 uses +.
- plotly::add_markers() corresponds to ggplot2::geom_point().
The frame around the ggplot2 panel comes from ggplot2::theme_set(ggplot2::theme_bw()) in the setup chunk for ggplot2 figures. To get rid of the frame I would use: ggplot2::theme(panel.border = ggplot2::element_blank()).
In this code chunk I suppressed the warning Can only have one: config after using plotly::subplot(). Other functions for plotting the two graphs created by different packages did not succeed. The problem is, as I understood, that they produce different object types:
- class(p1) = plotly, htmlwidget
- class(p2) = gg, ggplot
Note that with plotly::subplot() both graphics are interactive!
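The different object types can be checked with base::inherits(), which is more robust than comparing the full class vector (the exact vector may differ between package versions). The explicit conversion with plotly::ggplotly() is my addition, to show one way of bringing both figures into the same class; p2_widget is an illustrative name.

```r
p1 <- plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy) |>
  plotly::add_markers()
p2 <- ggplot2::ggplot(ggplot2::mpg, ggplot2::aes(x = cty, y = hwy)) +
  ggplot2::geom_point()

# The two objects belong to different class hierarchies
base::inherits(p1, "htmlwidget")  # plotly figures are htmlwidgets
base::inherits(p2, "ggplot")      # ggplot2 figures are ggplot objects
base::inherits(p2, "htmlwidget")  # a ggplot object is not an htmlwidget

# ggplotly() converts the ggplot2 figure into a plotly htmlwidget
p2_widget <- plotly::ggplotly(p2)
base::inherits(p2_widget, "htmlwidget")
```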
This section details scatter traces with a mode of “markers” (i.e., plotly::add_markers()). For simplicity, many of the examples here use plotly::add_markers() with a numeric x and y axis, which results in a scatterplot, a common way to visualize the association between two quantitative variables. The content that follows is still relevant for markers displayed with non-numeric x and y (aka dot plots) as shown in Section 3.2.7.
Overplotting, a common problem with scatterplots, can be combatted via alpha blending.
R Code 3.12 : Combatting overplotting in a scatterplot with alpha blending
plotly::subplot(
  plotly::plot_ly(
    ggplot2::mpg,
    x = ~cty,
    y = ~hwy,
    name = "default"
  ) |>
    plotly::add_markers(),
  plotly::plot_ly(
    ggplot2::mpg,
    x = ~cty,
    y = ~hwy
  ) |>
    plotly::add_markers(
      alpha = 0.2,
      name = "alpha"
    )
)
Mapping a discrete variable to color produces one trace per category (see: Section 2.2), which is desirable for its legend and hover properties.
On the other hand, mapping a numeric variable to color produces one trace, as well as a colorbar guide for visually decoding colors back to data values. As you can see in the online documentation there are many attributes for colorbar. But it is better to explore these attributes by navigating plotly::schema() and opening the following hierarchical folders: scatter->attributes->marker->colorbar.
The plotly::colorbar() function can be used to customize the appearance of this automatically generated guide. The default colorscale is viridis, a perceptually-uniform colorscale (even when converted to black-and-white), and perceivable even to those with common forms of color blindness. Viridis is also the default colorscale for ordered factors.
R Code 3.13 : Variations of numeric color mapping
p <- plotly::plot_ly(
  ggplot2::mpg,
  x = ~cty,
  y = ~hwy,
  alpha = 0.5
)
p1 <- plotly::add_markers(
  p,
  color = ~cyl,
  showlegend = FALSE
) |>
  plotly::colorbar(title = "Viridis")
p2 <- plotly::add_markers(
  p,
  color = ~base::factor(cyl)
)
plotly::subplot(p1, p2)
There are numerous ways to alter the default color scale via the colors argument. This argument accepts one of the following:
- a ColorBrewer palette name (see the row names of RColorBrewer::brewer.pal.info for valid names and have a look at the RColorBrewer color palettes),
- a vector of colors to interpolate, or
- a color interpolation function like grDevices::colorRamp() or scales::colour_ramp().
Although this grants a lot of flexibility, one should be conscious of using a sequential colorscale for numeric variables (& ordered factors) as shown in Figure 3.7, and a qualitative colorscale for discrete variables as shown in Figure 3.8.
Code Collection 3.3 : Color scales for numeric and discrete variables
R Code 3.14 : Color scale for numeric variables
col1 <- c("#132B43", "#56B1F7")
col2 <- viridisLite::inferno(10)
col3 <- grDevices::colorRamp(base::c("red", "white", "blue"))
plotly::subplot(
  plotly::add_markers(p, color = ~cyl, colors = col1) |>
    plotly::colorbar(title = "ggplot2 default"),
  plotly::add_markers(p, color = ~cyl, colors = col2) |>
    plotly::colorbar(title = "Inferno"),
  plotly::add_markers(p, color = ~cyl, colors = col3) |>
    plotly::colorbar(title = "colorRamp")
) |>
  plotly::hide_legend()
R Code 3.15 : Colorscale for discrete variables
col1 <- "Accent"
col2 <- grDevices::colorRamp(base::c("red", "blue"))
col3 <- c(`4` = "red", `5` = "black", `6` = "blue", `8` = "green")
plotly::subplot(
  plotly::add_markers(p, color = ~base::factor(cyl), colors = col1),
  plotly::add_markers(p, color = ~base::factor(cyl), colors = col2),
  plotly::add_markers(p, color = ~base::factor(cyl), colors = col3)
) |>
  plotly::hide_legend()
As introduced in Figure 2.5, color codes can be specified manually (i.e., avoid mapping data values to a visual range) by using the base::I() function. Figure 3.9 provides a simple example using plotly::add_markers(). Any color understood by the grDevices::col2rgb() function can be used in this way.
R Code 3.16 : Specify color manually
plotly::add_markers(p, color = base::I("red"))
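To find out whether a color specification is understood by grDevices::col2rgb(), it can be tried out directly. The helper is_valid_color() below is a hypothetical convenience function of mine, not part of any package.

```r
# Named colors and hex codes are both understood by col2rgb()
grDevices::col2rgb(base::c("red", "#56B1F7"))

# Hypothetical helper: TRUE if col2rgb() accepts the specification
is_valid_color <- function(x) {
  base::tryCatch(
    {
      grDevices::col2rgb(x)
      TRUE
    },
    error = function(e) FALSE
  )
}

is_valid_color("red")          # a named R color
is_valid_color("#56B1F7")      # a hex code
is_valid_color("not-a-color")  # an unknown color name raises an error internally
```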
The color argument is meant to control the ‘fill-color’ of a geometric object, whereas stroke (Section 3.2.5) is meant to control the ‘outline-color’ of a geometric object. In the case of plotly::add_markers(), that means color maps to the plotly.js attribute marker.color (scatter->attributes->marker->color) and stroke maps to marker.line.color (scatter->attributes->marker->line->color). Not all, but many, marker symbols have a notion of stroke.
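The symbol argument can be used to map data values to the marker.symbol plotly.js attribute. It uses the same semantics that we’ve already seen for color:
- A numeric mapping generates one trace.
- A discrete mapping generates multiple traces (one trace per category).
- The plural, symbols, can be used to specify the visual range for the mapping.
- Mappings are avoided entirely through base::I().
R Code 3.17 : Map data values with symbols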
p <- plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy, alpha = 0.3)
plotly::subplot(
  plotly::add_markers(p, symbol = ~cyl, name = "A single trace"),
  plotly::add_markers(p, symbol = ~base::factor(cyl), color = base::I("black"))
)
The left panel of Figure 3.15 uses a numeric mapping and the right panel uses a discrete mapping. As a result, the left panel is linked to the first legend entry (“A single trace”), whereas the right panel is linked to the bottom four legend entries.
The text in the book says three legend entries, but there are four different symbols for the different number of cylinders 4, 5, 6, 8.
When plotting multiple traces and no color is specified, the plotly.js colorway is applied (i.e., each trace will be rendered a different color). To set a fixed color, you can set the color of every trace generated from this layer with color = base::I("black"), or similar.
There are two ways to specify the visual range of symbols:
- numeric codes (interpreted as pch codes), or
- a character string specifying a valid marker.symbol value.
R Code 3.18 : Ways to specify the visual range of symbols
plotly::subplot(
  plotly::add_markers(
    p,
    symbol = ~cyl,
    symbols = c(17, 18, 19, 43)
  ),
  plotly::add_markers(
    p,
    color = base::I("black"),
    symbol = ~base::factor(cyl),
    symbols = c("triangle-up", "diamond", "circle", "cross-thin-open")
  )
)
Figure 3.16 uses pch codes (left panel) as well as their corresponding marker.symbol name (right panel) to specify the visual range. pch stands for plotting character. For more details see plotly::schema(F)$traces$scatter$attributes$marker$symbol$values. To see all the symbols available to plotly, as well as a method for supplying your own custom glyphs, see Chapter 28 of the plotly book: Working with symbols and glyphs.
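The schema lookup mentioned above can also be done non-interactively. A small sketch, assuming that the values entry can be searched with base::grep() as a character vector; symbol_values and named_symbols are my names.

```r
# jsonedit = FALSE returns the schema as a list instead of opening the viewer
symbol_values <- plotly::schema(jsonedit = FALSE)$traces$scatter$attributes$marker$symbol$values

# The values mix numeric codes and names; keep names containing a hyphen
named_symbols <- base::grep("-", symbol_values, value = TRUE)

utils::head(named_symbols)
base::length(symbol_values)
```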
As with colors, these symbols (i.e., the visual range) can also be supplied directly to symbol through base::I().
R Code 3.19 : Map symbols directly through base::I()
plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy) |>
  plotly::add_markers(symbol = base::I(18), alpha = 0.5)
The stroke argument follows the same semantics as color and symbol when it comes to variable mappings and specifying visual ranges. Typically you don’t want to map data values to stroke, you just want to specify a fixed outline color. By default, the span, or width of the stroke, is zero, so you’ll likely want to set the width to be around one pixel.
stroke and span?
The corresponding {ggplot2} arguments are color, fill, size and stroke: The size of the filled part is controlled by size, the size of the stroke (aka span in Plotly) is controlled by stroke. Each is measured in mm (and not in pixels).
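For comparison, here is a minimal {ggplot2} sketch of these four arguments. It assumes a fillable point shape (shape = 21), because only such shapes distinguish fill from outline; the concrete colors and sizes are arbitrary choices of mine.

```r
p_gg <- ggplot2::ggplot(ggplot2::mpg, ggplot2::aes(x = cty, y = hwy)) +
  ggplot2::geom_point(
    shape  = 21,           # fillable circle
    fill   = "steelblue",  # interior color
    colour = "black",      # outline color
    size   = 3,            # size of the filled part
    stroke = 1             # width of the outline
  )

p_gg
```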
R Code 3.20 : Mapping symbol, stroke and span directly
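A sketch of what such a direct mapping could look like. The pch code 18 (diamond), the black outline and the width of one pixel follow the text above; the name p_stroke is mine.

```r
p_stroke <- plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy, alpha = 0.5) |>
  plotly::add_markers(
    symbol = base::I(18),       # fixed symbol: pch code 18 (diamond)
    stroke = base::I("black"),  # fixed outline color
    span   = base::I(1)         # outline width of about one pixel
  )

p_stroke
```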
For scatterplots, the size argument controls the area of markers (unless otherwise specified via sizemode), and must be a numeric variable. The sizes argument controls the minimum and maximum size of circles, in pixels.
R Code 3.21 : Map size and sizes
p <- plotly::plot_ly(
  ggplot2::mpg,
  x = ~cty,
  y = ~hwy,
  alpha = 0.3,
  fill = ~'' # added this line to suppress warning
)
plotly::subplot(
  plotly::add_markers(p, size = ~cyl, name = "default"),
  plotly::add_markers(p, size = ~cyl, sizes = c(1, 500), name = "custom")
)
‘line.width does not currently support multiple values.’
To prevent the above warning message I had to add the line fill = ~''. For details see StackOverflow.
Similar to other arguments, base::I() can be used to specify the size directly. In the case of markers, size controls the marker.size plotly.js attribute. Remember, you always have the option to set this attribute directly by doing something similar to Figure 3.20.
R Code 3.22 : Set size directly
plotly::plot_ly(
  ggplot2::mpg,
  x = ~cty,
  y = ~hwy,
  alpha = 0.3,
  size = base::I(100)
) |>
  plotly::add_markers(type = "scatter") ## added to prevent warnings
Using the code from the book I received the following warning message twice:
No trace type specified:
Based on info supplied, a ‘scatter’ trace seems appropriate.
Read more about this trace type -> https://plotly.com/r/reference/#scatter
No scatter mode specifed:
Setting the mode to markers
Read more about this attribute -> https://plotly.com/r/reference/#scatter-mode
To prevent this warning I had to add another layer plotly::add_markers(type = "scatter").
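As mentioned above, marker.size can also be set as a plotly.js attribute, without going through the size argument. A minimal sketch, assuming a nested list passed through marker is forwarded as is; the value 10 and the name p_size are arbitrary choices of mine.

```r
# Set the plotly.js attribute marker.size directly via a named list
p_size <- plotly::plot_ly(ggplot2::mpg, x = ~cty, y = ~hwy, alpha = 0.3) |>
  plotly::add_markers(marker = base::list(size = 10))

p_size

# Inspect the value that ends up in the built trace
plotly::plotly_build(p_size)$x$data[[1]]$marker$size
```

Since the trace type and mode come from plotly::add_markers() here, I would not expect the two warnings shown above, but I have not verified this for every {plotly} version.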
The plotly::add_polygons() function is essentially equivalent to plotly::add_paths() with the fill attribute set to “toself”. Polygons form the basis for other, higher-level scatter-based layers (e.g., plotly::add_ribbons() and plotly::add_sf()) that don’t have a dedicated plotly.js trace type. Polygons can be used to draw many things, but perhaps the most familiar application where you might want to use plotly::add_polygons() is to draw geo-spatial objects.
If and when you use plotly::add_polygons() to draw a map, make sure you fix the aspect ratio (e.g., xaxis.scaleanchor) and also consider using plotly::plotly_empty() over plotly::plot_ly() to hide axis labels, ticks, and the background grid. On the other hand, Section 4.2 shows you how to make custom maps using the {sf} package and plotly::add_sf(), which is a bit of work to get started, but is absolutely worth the investment.
R Code 3.23 : Drawing Maps Using {maps}
base_map <- ggplot2::map_data("world", "canada") |>
  dplyr::group_by(group) |>
  plotly::plotly_empty(x = ~long, y = ~lat, alpha = 0.2) |>
  plotly::layout(showlegend = FALSE, xaxis = base::list(scaleanchor = "y"))
base_map |>
  plotly::add_polygons(hoverinfo = "none", color = base::I("black")) |>
  plotly::add_markers(
    text = ~base::paste(name, "<br />", pop),
    hoverinfo = "text",
    color = base::I("red"),
    data = maps::canada.cities
  )
Using plotly::add_polygons() to make a map of Canada and major Canadian cities via data provided by the {maps} package.
ggplot2::map_data() is a function to turn data from the {maps} package into a data frame suitable for plotting with {ggplot2}.
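To see what ggplot2::map_data() returns, it helps to inspect the data frame before plotting. This sketch assumes the {maps} package is installed; canada is an illustrative name.

```r
# Turn the {maps} outline of Canada into a data frame
canada <- ggplot2::map_data("world", "canada")

# Columns include long, lat, group, order, and region
utils::str(canada)

# Every polygon (mainland or island outline) has its own group id,
# which is why the map code groups by group before drawing polygons
base::length(base::unique(canada$group))
```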