4.4 Conclusion

4.4.1 Summary

Visualizations can be good, bad, or anything in between. The success of any particular visualization depends on its ecological rationality: On the one hand, the type of graph chosen and its aesthetic features need to fit the data being shown. On the other hand, the message to be conveyed and the audience that will view and interpret the graph need to be considered.

Plotting in R

Overall, the base R plotting system is very flexible and powerful, and offers a high degree of control over plotting. But as the graphical functions of R have been developed over a long period of time, they are quite heterogeneous. The main reason for this heterogeneity is that the base R plotting system simultaneously pursues two distinct strategies:

  1. On the one hand, there are many pre-packaged graphical commands — like hist(), barplot(), or boxplot() — that combine several aspects and provide options for quickly generating some particular type of visualization.

  2. On the other hand, there are many low-level plotting functions for designing new visualizations from scratch, or for modifying existing plots.

As the latter functions often need to be combined with the former, the combination of both strategies increases complexity and frequently confuses R novices.
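
The interplay of both strategies can be illustrated by a minimal sketch. The data are simulated here (an assumption for illustration, not an example from this chapter): A pre-packaged command first creates a complete plot, and low-level functions then add elements to it.

```r
# Simulated data (an assumption for illustration):
set.seed(1)
x <- rnorm(200, mean = 100, sd = 15)

# Strategy 1: A pre-packaged command creates a complete plot
# (and invisibly returns an object describing the histogram):
h <- hist(x, main = "Distribution of x", xlab = "Value", col = "grey90")

# Strategy 2: Low-level functions add elements to the existing plot:
abline(v = mean(x), col = "firebrick", lwd = 2, lty = 2)  # dashed vertical line at the mean
text(x = mean(x), y = max(h$counts), labels = "mean",
     pos = 4, col = "firebrick")                          # label to the right of the line
```

Note that abline() and text() only work once a plot exists, which is one reason why mixing both kinds of functions can be confusing at first.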

An alternative to a range of different functions for creating different visualizations would be a unified system that generates many different types of visualizations from a common set of principles. Remember the Swiss knife analogy invoked in Chapter 2 on Basic R concepts and commands: Rather than using a range of specialized tools, someone could design a toolbox that provides many different functions in a systematic fashion (e.g., by sharing the same arguments and command syntax for different visualizations). Such a toolbox is provided by the ggplot2 package, which is discussed in Chapter 2 on Visualizing data of the ds4psy textbook (Neth, 2021).

Plot types

There are many different types of graphs and corresponding commands in R. In this chapter, we have learned to use base R functions for creating a few of them:

  • histograms show a variable’s distribution of values;

  • scatterplots (and some variants) show the relation between two variables;

  • bar plots show the values of one or more categorical variables;

  • etc.

See the icons in Chapter 5: Directory of visualizations of Wilke (2019) for many additional types of plots.
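
As a brief reminder, the three plot types listed above could be created as follows. This sketch assumes the mtcars data that ships with R (chosen here for illustration and not necessarily the data used earlier in this chapter):

```r
# Histogram: distribution of a numeric variable
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")

# Scatterplot: relation between two numeric variables
plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Weight vs. fuel efficiency",
     xlab = "Weight (1000 lbs)", ylab = "mpg")

# Bar plot: frequency of each level of a categorical variable
cyl_counts <- table(mtcars$cyl)  # count cars per number of cylinders
barplot(cyl_counts, main = "Cars by number of cylinders",
        xlab = "Cylinders", ylab = "Frequency")
```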

Aesthetic elements

Key aesthetic elements (and corresponding arguments of base R functions) include:

  • color of various elements (col, border, bg, fg)

  • line width (lwd) and type (lty)

  • point shape (pch, see ?points for possible values)

  • size of symbols or text (cex)

For a primer on using colors in R, see Appendix D: Using colors of the ds4psy textbook (Neth, 2021).
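
The following sketch combines several of these arguments in a single call to plot(). The data and the particular values are arbitrary choices for illustration:

```r
x <- 1:10
y <- x^2  # arbitrary values

plot(x, y, type = "b",   # "b": draw both points and lines
     col = "steelblue",  # color of lines and point borders
     pch = 21,           # point shape: circle that can be filled
     bg  = "gold",       # fill color (used by pch 21 to 25)
     cex = 1.5,          # symbol size, relative to the default of 1
     lwd = 2,            # line width
     lty = 2)            # line type (2 = dashed)
```

Changing one argument at a time is a simple way to explore what each of them does.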

Plot elements

Key arguments for setting properties of plot() include:

  • main for providing a plot title (as character);

  • xlab and ylab for providing axis labels (as character);

  • xlim and ylim for providing the limits of axis ranges (as a numeric vector of start and end values);

  • asp for setting the aspect ratio (as a number y/x);

  • las for setting the orientation of axis labels (as a number 0–3).

For additional parameters, see the documentation of plot() and par() (by evaluating ?plot and ?par).
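
A minimal sketch that uses all of the arguments listed above could look as follows (with arbitrary data and labels, chosen for illustration):

```r
x <- seq(0, 10, by = 0.5)
y <- x / 2  # arbitrary values

plot(x, y,
     main = "A line with a slope of 1/2",   # plot title
     xlab = "x values", ylab = "y values",  # axis labels
     xlim = c(0, 10), ylim = c(0, 10),      # axis ranges (start and end)
     asp  = 1,  # aspect ratio y/x: one unit on y is as long as one unit on x
     las  = 1)  # orientation of axis labels: always horizontal
```

Setting asp = 1 ensures that the slope of the points is not distorted by the shape of the plotting window. As a consequence, R may extend an axis beyond the limits given by xlim or ylim.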

Creating plots

Scientific visualizations should typically contain the following elements:

  1. A descriptive title or caption that states what the graph is showing;

  2. axes with descriptive labels and sensible value ranges;

  3. one or more geometric objects (e.g., points, bars, lines) that depict the data in a clear fashion;

  4. informative labels or a legend that explains the mapping of geometric objects and aesthetic features to data elements (e.g., which color, line, or shape, is showing which variable for which group).

When creating a new graph, planning these four elements is a good heuristic. Given the abundance of options, we should always aim to create the basic plot before fiddling with labels and aesthetic parameters (like colors, themes, etc.).
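
The four elements can be sketched in a few lines of base R code. The example assumes the mtcars data that ships with R. The variable names grp, grp_cols, and grp_pch are hypothetical helpers introduced only for this illustration:

```r
# Group cars by transmission type (am: 0 = automatic, 1 = manual):
grp      <- mtcars$am + 1                # group index (1 or 2) of each car
grp_cols <- c("steelblue", "firebrick")  # one color per group
grp_pch  <- c(16, 17)                    # one point shape per group

plot(x = mtcars$wt, y = mtcars$mpg,
     main = "Fuel efficiency by car weight",                 # 1. descriptive title
     xlab = "Weight (1000 lbs)", ylab = "Miles per gallon",  # 2. axis labels
     xlim = c(1, 6), ylim = c(10, 35),                       #    and value ranges
     col = grp_cols[grp], pch = grp_pch[grp])                # 3. geometric objects (points)

legend("topright", title = "Transmission",                   # 4. legend that maps colors
       legend = c("automatic", "manual"),                    #    and shapes to groups
       col = grp_cols, pch = grp_pch)
```

In practice, we would first run plot() with only x and y, check that the basic plot looks right, and then add the remaining arguments and the legend.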

Conclusion

Creating good visualizations is both an art and a craft. R provides abundant tools, but using them in a successful fashion is mostly a matter of experience.

The insight that any representation can be good or bad at serving particular purposes is an important point to keep in mind beyond visualizations.

4.4.2 Resources


Here are some links to general resources on visualization, not just in R.

Background information and inspiration

The landmark publications on data visualization by Jacques Bertin (e.g., Bertin, 2011) and Edward R. Tufte (Tufte et al., 1990; Tufte, 2001, 2006) combine sound advice with many inspiring examples. Friendly (2008) provides a historical perspective with many beautiful examples.

More recent publications that are geared to the needs of aspiring data scientists include:

More specific resources on the principles of data visualization (with many beautiful or bizarre examples) include:

Online resources

Inspiration and tools for additional types of visualizations can be found at (from specific to general):

Plotting in base R

Here are some links to helpful resources on the base R plotting system:

Colors in R

The grDevices component of R comes with many options and tools for selecting and modifying colors:

  • Call colors() or demo("colors") in the Console to view the built-in colors of R.

For a primer on using colors in R, see Appendix D: Using colors of the ds4psy textbook (Neth, 2021).
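
As a small sketch of what grDevices offers, the following code inspects the built-in color names, makes a color transparent, and creates a simple color gradient. The color "steelblue" and the object names are arbitrary choices for illustration:

```r
all_cols <- colors()   # character vector of built-in color names
head(all_cols)         # show the first few names

col2rgb("steelblue")   # RGB components of a named color

# A semi-transparent version of a named color (as "#RRGGBBAA"):
my_col <- adjustcolor("steelblue", alpha.f = 0.5)

# A gradient of 5 colors, interpolated between two named colors:
my_pal <- colorRampPalette(c("white", "steelblue"))(5)

# Show the gradient as 5 bars of equal height:
barplot(rep(1, 5), col = my_pal, border = NA, axes = FALSE)
```

Transparent colors are particularly useful when many points or lines overlap in a plot.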

4.4.3 Preview

Given that we have some basic knowledge of the base R graphics system, a good next step would be to check out the ggplot2 package (Wickham et al., 2020). For instance, here are two introductory chapters:

References

Bertin, J. (2011). Semiology of graphics: Diagrams, networks, maps (Vol. 1). ESRI Press.
Cairo, A. (2012). The functional art: An introduction to information graphics and visualization. New Riders.
Cairo, A. (2016). The truthful art: Data, charts, and maps for communication. New Riders.
Chang, W. (2012). R graphics cookbook: Practical recipes for visualizing data (2nd ed.). O’Reilly Media. https://r-graphics.org/
Friendly, M. (2008). A brief history of data visualization. In Handbook of data visualization (pp. 15–56). Springer.
Healy, K. (2018). Data visualization: A practical introduction. Princeton University Press. https://socviz.co/
Kabacoff, R. (2018). Data visualization with R. Quantitative Analysis Center. https://rkabacoff.github.io/datavis/
Neth, H. (2021). Data science for psychologists. Social Psychology; Decision Sciences, University of Konstanz. https://bookdown.org/hneth/ds4psy/
Peng, R. D. (2020). Exploratory Data Analysis with R. Leanpub. https://bookdown.org/rdpeng/exdata/
Tufte, E. R. (2001). The visual display of quantitative information (2nd ed.). Graphics Press.
Tufte, E. R. (2006). Beautiful evidence (Vol. 1). Graphics Press.
Tufte, E. R., Goeler, N. H., & Benson, R. (1990). Envisioning information (Vol. 126). Graphics Press.
Wickham, H., Chang, W., Henry, L., Pedersen, T. L., Takahashi, K., Wilke, C., Woo, K., Yutani, H., & Dunnington, D. (2020). ggplot2: Create elegant data visualisations using the grammar of graphics. https://CRAN.R-project.org/package=ggplot2
Wickham, H., & Grolemund, G. (2017). R for data science: Import, tidy, transform, visualize, and model data. O’Reilly Media, Inc. http://r4ds.had.co.nz
Wilke, C. O. (2019). Fundamentals of data visualization: A primer on making informative and compelling figures. O’Reilly Media. https://clauswilke.com/dataviz/
Yau, N. (2011). Visualize this: The FlowingData guide to design, visualization, and statistics. John Wiley & Sons.
Yau, N. (2013). Data points: Visualization that means something. John Wiley & Sons.