Appendix E: Packages used
E.1 cowplot
Package Profile: cowplot
The {cowplot} package provides various features that help with creating publication-quality figures, such as a set of themes, functions to align plots and arrange them into complex compound figures, and functions that make it easy to annotate plots and or mix plots with images. The package was originally written for internal use in the Wilke lab, hence the name (Claus O. Wilke’s plot package). It has also been used extensively in the book Fundamentals of Data Visualization.
There are several packages that can be used to align plots. The most widely used ones beside {cowplot} are {egg} and {patchwork} (see Section E.9). All these packages use slightly different approaches to plot alignment, and the respective approaches have different strengths and weaknesses. If you cannot achieve your desired result with one of these packages try another one.
Most importantly, while {egg} and {patchwork} align and arrange plots at the same time, {cowplot} aligns plots independently of how they are arranged. This makes it possible to align plots and then reproduce them separately, or even overlay them on top of each other.
The {cowplot} package now provides a set of complementary themes with different features. I now believe that there isn’t one single theme that works for all figures, and therefore I recommend that you always explicitly set a theme for every plot you make.
E.2 dplyr
Package Profile: dplyr
{dplyr} is a grammar of data manipulation, providing a consistent set of verbs that help you solve the most common data manipulation challenges: - mutate() adds new variables that are functions of existing variables - select() picks variables based on their names. - filter() picks cases based on their values. - summarise() reduces multiple values down to a single summary. - arrange() changes the ordering of the rows.
These all combine naturally with group_by()
which allows you to perform any operation “by group”. You can learn more about them in vignette(“dplyr”). As well as these single-table verbs, dplyr also provides a variety of two-table verbs, which you can learn about in vignette(“two-table”). (Wickham et al., 2023)
E.3 forcats
Package Profile: forcats
- reordering factor levels
- moving specified levels to front,
- ordering by first appearance,
- reversing, and
- randomly shuffling
- tools for modifying factor levels
- collapsing rare levels into other,
- ‘anonymizing’, and
- manually ‘recoding’
E.4 ggplot2
Package Profile: ggplot2
{ggplot2} is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell {ggplot2} how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details. (Wickham, 2016)
It’s hard to succinctly describe how {ggplot2} works because it embodies a deep philosophy of visualization. However, in most cases you start with ggplot()
, supply a dataset and aesthetic mapping (with aes()
). You then add on layers (like geom_point()
or geom_histogram()
), scales (like scale_colour_brewer()
), faceting specifications (like facet_wrap()
) and coordinate systems (like coord_flip()
).
E.5 glossary
Package Profile: glossary
There is a lot of necessary jargon to learn for working with geospatial data. The goal of this glossary is to provide a lightweight solution for making glossaries in educational materials written in quarto or R Markdown. This package provides functions to link terms in text to their definitions in an external glossary file, as well as create a glossary table of all linked terms at the end of a section.
E.6 glue
Package Profile: glue
Glue offers interpreted string literals that are small, fast, and dependency-free. Glue does this by embedding R expressions in curly braces which are then evaluated and inserted into the argument string.
E.7 gridExtra
Package Profile: gridExtra
Provides a number of user-level functions to work with “grid” graphics, notably to arrange multiple grid-based plots on a page, and draw tables.
The {grid) package (= part of the R system library) provides low-level functions to create graphical objects (grobs
), and position them on a page in specific viewports. The {gtable} package introduced a higher-level layout scheme, arguably more amenable to user-level interaction. With the gridExtra::arrangeGrob()
/ gridExtra::grid.arrange()
pair of functions, {gridExtra} builds upon {gtable} to arrange multiple grobs
on a page.
E.8 gtable
Package Profile: gtable
Tools to make it easier to work with “tables” of ‘grobs’. The ‘gtable’ package defines a ‘gtable’ grob class that specifies a grid along with a list of grobs and their placement in the grid. Further the package makes it easy to manipulate and combine ‘gtable’ objects so that complex compositions can be built up sequentially.
E.9 patchwork
Package Profile: patchwork
The goal of {patchwork} is to make it ridiculously simple to combine separate ggplots
into the same graphic. As such it tries to solve the same problem as gridExtra::grid.arrange()
and cowplot::plot_grid
but using an API that incites exploration and iteration, and scales to arbitrarily complex layouts.
The {ggplot2} package provides a strong API for sequentially building up a plot, but does not concern itself with composition of multiple plots. {patchwork} is a package that expands the API to allow for arbitrarily complex composition of plots by, among others, providing mathematical operators for combining multiple plots. Other packages that try to address this need (but with a different approach) are {gridExtra} and {cowplot} (see Section E.7 and Section E.1).
Before plots can be laid out, they have to be assembled. Arguably one of patchwork’s biggest selling points is that it expands on the use of +
in ggplot2 to allow plots to be added together and composed, creating a natural extension of the {ggplot2} API.
While quite complex compositions can be achieved using +
, |
, and /
, it may be necessary to take even more control over the layout. All of this can be controlled using the patchwork::plot_layout()
function along with a couple of special placeholder objects.
{patchwork}: Personal Evaluation
In this book I am using the double colon notation instead of a library()
call. Without this call it is more difficult to use the {patchwork} package.
See Using plot arithmetic functions with ::
syntax
operator | function | effect |
---|---|---|
+ | ggplot2:::"+.gg"() |
side by side |
- | patchwork:::"-.ggplot"() |
|
| | patchwork:::"\|.ggplot"() |
|
/ | patchwork:::"/.ggplot"() |
stacked |
* | patchwork:::"*.gg"() |
|
& | patchwork:::"&.gg"() |
E.10 plotly
Package Profile: plotly
Plotly.js is a standalone Javascript data visualization library, and it also powers the Python and R modules named plotly in those respective ecosystems (referred to as Plotly.py and Plotly.R).
Plotly.js can be used to produce dozens of chart types and visualizations, including statistical charts, 3D graphs, scientific charts, SVG and tile maps, financial charts and more.
E.11 rnaturalearth
Package Profile: rnaturalearth
Facilitates mapping by making natural earth map data from https://www.naturalearthdata.com/ more easily available to R users. One low resolution example file & data can be downloaded from NaturalEarth using rnaturalearth::ne_download()
. For low (scale 110) and medium resolution (scale 50) files you would need the {rnaturalearthdata} package (Section E.12). High resolution (scale 10) files are too large to be submitted to CRAN because its size exceeds CRAN recommendations. These data are only available at the GitHub repo.
This package provides :
- access to a pre-downloaded subset of Natural Earth v4.1.0 (March 2018) vector data commonly used in world mapping
- easy subsetting by countries and regions
- functions to download other Natural Earth vector and raster data
- a simple, reproducible and sustainable workflow from Natural Earth data to {rnaturalearth} enabling updating as new versions become available
- clarification of differences in world maps classified by countries, sovereign states and map units
- consistency with Natural Earth naming conventions so that {rnaturalearth} users can use Natural Earth documentation
- data in sf or sv formats
The Natural Earth website structures vector data by scale, category and type. These determine the filenames of downloads. {rnaturalearth} uses this structure to facilitate download (like an API).
E.12 rnaturalearthdata
Package Profile: Package ‘rnaturalearthdata’
Package holds low and medium resolution data used by {rnaturalearth}. Vector map data from https://www.naturalearthdata.com/. Access functions are provided in the accompanying package {rnaturalearth}. See Section E.11.
E.13 s2
Package Profile: s2
Provides R bindings for Google’s s2 library for geometric calculations on the sphere. High-performance constructors and exporters provide high compatibility with existing spatial packages, transformers construct new geometries from existing geometries, predicates provide a means to select geometries based on spatial relationships, and accessors extract information about geometries.
E.14 sf
Package Profile: sf
Support for simple feature access, a standardized way to encode and analyze spatial vector data. Binds to ‘GDAL’ doi:10.5281/zenodo.5884351 for reading and writing data, to ‘GEOS’ doi:10.5281/zenodo.11396894 for geometrical operations, and to ‘PROJ’ doi:10.5281/zenodo.5884394 for projection conversions and datum transformations. Uses by default the {s2} package for geometry operations on geodetic (long/lat degree) coordinates.
The above description may sound cryptic, but I will not explain it here. This whole note-book is dedicated to learn the various features of the {sf} package and to use it appropriately.
E.15 skimr
Package Profile: skimr
The default summary statistics may be modified by the user as can the default formatting. Support for data frames and vectors is included, and users can implement their own skim methods for specific object types as described in a vignette. Default summaries include support for inline spark graphs. Instructions for managing these on specific operating systems are given in the Using skimr vignette and the README.
{skimr}: Personal Evaluation
At the moment I am just using the skimr::skim()
function. I believe most of the many other functions for adaption are oriented to developers. But still: I need to have a closer look to this package.
E.16 sp
Package Profile: sp
Classes and methods for spatial data; the classes document where the spatial location information resides, for 2D or 3D data. Utility functions are provided, e.g. for plotting data as maps, spatial selection, as well as methods for retrieving coordinates, for subsetting, print, summary, etc. From this version, ‘rgdal’, ‘maptools’, and ‘rgeos’ are no longer used at all, see sp evolution status for details.
{sp}: use sf instead of rgdal, rgeos and maptools
Since October 2023 is the state of transition process from sp to sf finished. In new R code and dataset you should not be bothered anymore with the {sp} package.
E.17 tidyr
Package Profile: tidyr
Tidy data is data where: - Every column is a variable. - Every row is an observation. - Every cell is a single value.
E.18 tidyselect
Package Profile: tidyselect
(There is no hexagon sticker available for {tidyselect}.)
The {tidyselect} package is the backend of functions like dplyr::select()
or dplyr::pull()
as well as several {tidyr} verbs. It allows you to create selecting verbs that are consistent with other {tidyverse} packages.
To learn about the selection syntax as a user of {dplyr} or {tidyr}, read the user-friendly ?language reference.
To learn how to implement tidyselect in your own functions, read vignette(“tidyselect”).
To learn exactly how the {tidyselect} syntax is interpreted, read the technical description in vignette(“syntax”).
E.19 tidyverse
Package Profile: tidyverse
The {tidyverse} is an opinionated collection of R packages designed for data science.
All packages share an underlying design philosophy, grammar, and data structures (Wickham et al., 2019). Read more about the philosophy and purpose: The tidy tools manifesto and Welcome to the {tidyverse}
{tidyverse}: Personal Evaluation
In this book I am not going to load {tidyverse} with all its packages. Instead I am using the <package>::<function>
format to access the commands. Explicitly mentioned the used packages with every function call helps me to learn which package is responsible for which function.