1 Overview

The R programming language gives researchers access to a wide range of fully customisable data visualisation options, which are typically not available in point-and-click software. These visualisations are not only visually appealing; they can also show the distribution of the underlying data more transparently than the commonly used plots of aggregated values.

In this introductory section of the course, we will provide a practical introduction to R, focusing on the data visualisation techniques that you will use throughout the course. First, we will explain the rationale for visualising data in R with the ggplot2 package. We will then begin with common plots such as histograms and boxplots, before extending to the more complex structures used in spatial data visualisation.

1.1 The ggplot2 package

There are many options for data visualisation in R. In this course, we will mainly use the ggplot2 package, which is part of the tidyverse, a larger collection of packages that provide functions for efficient data management in R. We will also use other tidyverse packages in the course.

A grammar of graphics is a standardised way to describe the components of a graphic. ggplot2 uses a layered grammar of graphics, in which plots are built up in a series of layers. It may be helpful to think of any picture as having multiple elements that sit semi-transparently over each other. Figure 1.1 shows the evolution of a simple scatterplot using this layered approach. First, the plot space is built (layer 1); the variables are specified (layer 2); the type of visualisation desired for these variables is specified (layer 3), in this case by calling geom_point() to visualise individual data points; a second geom layer is added to include a line of best fit (layer 4); the axis labels are edited for readability (layer 5); and finally, a theme is applied to change the overall appearance of the plot (layer 6).


Figure 1.1: Evolution of a layered plot.
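
The six steps described above can be sketched in code. This is only an illustration, not the code used to produce Figure 1.1: it assumes the built-in mtcars dataset and its wt and mpg columns, and the axis labels and the choice of theme_minimal() are likewise assumptions made for this example.

```r
library(ggplot2)

# Layer 1: build an empty plot space
p1 <- ggplot()

# Layer 2: specify the data and the variables mapped to each axis
# (mtcars, wt and mpg are assumed here purely for illustration)
p2 <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg))

# Layer 3: choose the type of visualisation, here individual data points
p3 <- p2 + geom_point()

# Layer 4: add a second geom layer with a linear line of best fit
p4 <- p3 + geom_smooth(method = "lm", formula = y ~ x)

# Layer 5: edit the axis labels for readability
p5 <- p4 + labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Layer 6: apply a theme to change the overall appearance
p6 <- p5 + theme_minimal()

# Display the finished plot
p6
```

Each intermediate object (p1 to p5) is itself a valid plot, so you can display any of them to see what a single step contributes.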

Each layer is independent and individually customisable. For example, the size, colour and position of each component can be adjusted. The use of layers makes it easy to build up complex plots step-by-step, and to adapt or extend plots from existing code.
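
As a minimal sketch of this kind of customisation, the example below adjusts the size, colour and position of the points in a single layer, and then extends the existing plot with a further layer. The mtcars dataset, the specific values and the object names are all assumptions chosen for illustration.

```r
library(ggplot2)

# A scatterplot whose point layer is customised on its own:
# size and colour are fixed values, and a small horizontal jitter
# changes the position of the points (all values are illustrative)
p <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point(
    size = 3,
    colour = "steelblue",
    position = position_jitter(width = 0.1, height = 0)
  )

# Existing code can be extended by adding another layer to the saved plot;
# the point layer above is left unchanged
p_extended <- p +
  geom_smooth(method = "lm", formula = y ~ x, colour = "black")

p_extended
```

Because the customisation lives inside geom_point(), changing it does not affect the line of best fit, and vice versa.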

1.2 Data

In this course, we will use some datasets for analysis. You can download these using the following command: