Visualizations can be good, bad, or anything in between. The success of any particular visualization depends on its ecological rationality: On the one hand, the type of graph chosen and its aesthetic features need to fit the data being shown. On the other hand, the message to be conveyed and the audience that will view and interpret the graph need to be considered.
There are many different types of graphs and corresponding commands in R. In this chapter, we have learned to use base R functions for creating a few of them:
histograms show a variable’s distribution of values;
scatterplots (and some variants) show the relation between two variables;
bar plots show the values of one or more categorical variables.
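The three graph types listed above can be sketched with one base R call each. This is a minimal example, assuming the built-in mtcars data set (which is not used in the text above); the variable names are chosen for illustration only.

```r
# Built-in example data (assumption): datasets::mtcars, 32 cars
mpg <- mtcars$mpg   # miles per gallon (numeric)
wt  <- mtcars$wt    # weight in 1000 lbs (numeric)
cyl <- mtcars$cyl   # number of cylinders (treated as categorical here)

# Histogram: distribution of the values of one numeric variable
h <- hist(mpg)

# Scatterplot: relation between two numeric variables
plot(x = wt, y = mpg)

# Bar plot: frequency of each level of a categorical variable
cyl_counts <- table(cyl)
b <- barplot(cyl_counts)
```

Note that hist() and barplot() also return values (the bin counts and the bar midpoints, respectively), which can be useful for adding further elements to a plot.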
Key aesthetic elements (and corresponding arguments of base R functions) include:
color of various elements (col);
line width (lwd) and type (lty);
point shape (pch, see ?points for possible values);
size of symbols or text (cex).
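As a rough illustration of these aesthetic arguments, the following sketch combines the standard base R graphical parameters col, lwd, lty, pch, and cex in one plot. The data (x and y) and the specific colors and sizes are made up for this example.

```r
# Hypothetical example data
x <- 1:10
y <- x^2

# Points: filled circles (pch = 16), colored, 1.5 times the default size
plot(x, y, pch = 16, col = "steelblue", cex = 1.5)

# Line: dashed (lty = 2), twice the default width (lwd = 2)
lines(x, y, lty = 2, lwd = 2, col = "firebrick")

# Text annotation: slightly smaller than the default size (cex = 0.8)
text(x = 3, y = 80, labels = "y = x^2", cex = 0.8, col = "gray30")
```

Calling colors() lists all color names that base R recognizes.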
Key arguments for setting properties of plots include:
main for providing a plot title (as character);
xlab and ylab for providing axes labels (as character);
xlim and ylim for providing the limits of axes ranges (as a numeric vector of start and end values);
asp for setting the aspect ratio (as a number y/x);
las for setting the orientation of axis labels (as a number 0–3).
For additional parameters, see the documentation of graphical parameters (?par).
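The plot arguments listed above can be combined in a single call. The following is a minimal sketch, again assuming the built-in mtcars data; the title, labels, and axis limits are illustrative choices, not recommendations.

```r
# Scatterplot with title, axis labels, axis ranges, and horizontal tick labels
plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Fuel economy by car weight",   # plot title
     xlab = "Weight (1000 lbs)",            # x-axis label
     ylab = "Miles per gallon",             # y-axis label
     xlim = c(1, 6),                        # x-axis range (start, end)
     ylim = c(10, 35),                      # y-axis range (start, end)
     las  = 1)                              # all axis tick labels horizontal

# asp = 1: one unit on the y-axis is drawn as long as one unit on the x-axis,
# so that a circle actually looks like a circle
theta <- seq(0, 2 * pi, length.out = 100)
plot(cos(theta), sin(theta), type = "l", asp = 1, main = "Unit circle")
```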
Scientific visualizations should typically contain the following elements:
a descriptive title or caption that states what the graph is showing;
axes with descriptive labels and sensible value ranges;
one or more geometric objects (e.g., points, bars, lines) that depict the data in a clear fashion;
informative labels or a legend that explains the mapping of geometric objects and aesthetic features to data elements (e.g., which color, line, or shape shows which variable for which group).
When creating a new graph, planning for these four elements is a good heuristic. Given the abundance of options, we should always aim to create the basic plot first, before fiddling with labels and aesthetic parameters (like colors, themes, etc.).
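The four elements of a scientific visualization can be assembled step by step in base R. The following sketch is one possible way of doing so, assuming the built-in mtcars data with the number of cylinders as a grouping variable; the names groups and cols as well as the chosen colors are hypothetical.

```r
# Grouping variable and one color per group (illustrative choices)
groups <- sort(unique(mtcars$cyl))                 # 4, 6, 8
cols   <- c("steelblue", "orange", "firebrick")

# Elements 1 and 2: empty plot (type = "n") with title, axis labels, and ranges
plot(x = 0, y = 0, type = "n",
     main = "Fuel economy by weight and number of cylinders",
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",
     xlim = c(1, 6), ylim = c(10, 35), las = 1)

# Element 3: geometric objects (here: one set of points per group)
for (i in seq_along(groups)) {
  sel <- mtcars$cyl == groups[i]
  points(mtcars$wt[sel], mtcars$mpg[sel], pch = 16, col = cols[i])
}

# Element 4: legend that maps colors to groups
legend("topright", legend = paste(groups, "cylinders"),
       pch = 16, col = cols, title = "Group")
```

Starting from an empty plot keeps the basic structure (title, axes, ranges) separate from the data layers, so that aesthetic details can be adjusted later without changing the overall layout.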
Creating good visualizations is both an art and a craft. R provides abundant tools, but using them in a successful fashion is mostly a matter of experience.
The insight that any representation can be good or bad at serving particular purposes is an important point to keep in mind beyond visualizations.
Here are some links to general resources on visualization, not just in R.
Background information and inspiration
The landmark publications on data visualization by Jacques Bertin (e.g., Bertin, 2011) and Edward R. Tufte (Tufte, 2001, 2006; Tufte, Goeler, & Benson, 1990) combine sound advice with many inspiring examples. Friendly (2008) provides a historical perspective with many beautiful examples.
More recent publications that are geared to the needs of aspiring data scientists include:
More specific resources on the principles of data visualization (with many beautiful or bizarre examples) include:
Data visualization principles (by Rafael A. Irizarry)
Data visualization: Basic principles (by Peter Aldhous)
Inspiration and tools for additional types of visualizations can be found at (from specific to general):
Plotting in base R
Here are some links to helpful resources on the base R plotting system:
Given that we now have some basic knowledge of the base R graphics system, a good next step would be to check out the ggplot2 package (Wickham et al., 2020). For instance, here are two introductory chapters: