Chapter 3 Using R

3.1 Introduction

Beyond the many resources on using R to solve statistical and data science challenges, there are also many on how to use R itself more effectively.

This chapter compiles what I consider to be the essential texts; articles and blog posts are kept to a minimum.


3.2 The R community

The R Consortium "is a group organized under an open source governance and foundation model to support the worldwide community of users, maintainers and developers of R software. … The central mission of the R Consortium is to work with and provide support to the R Foundation and key organizations and groups developing, maintaining, distributing and using R software."

The R Foundation

is a not for profit organization working in the public interest. It has been founded by the members of the R Development Core Team in order to

* Provide support for the R project and other innovations in statistical computing. We believe that R has become a mature and valuable tool and we would like to ensure its continued development and the development of future innovations in software for statistical and computational research.
* Provide a reference point for individuals, institutions or commercial enterprises that want to support or interact with the R development community.
* Hold and administer the copyright of R software and documentation.

Martyn Plummer, The R Consortium and the R Foundation, The R Journal Vol. 7/2, December 2015

David Keyes, 2019-07-29, If You Care About Equity, Use R

Julia Stewart Lowndes, 2019-12-10, Open Software Means Kinder Science, Scientific American blog


3.3 General & all-encompassing resources

David Smale, Free R Reading Material, a Shiny app collection “of books about the R programming language and Data Science, that you can read for free!”

Jennifer Bryan and Jim Hester, What They Forgot to Teach You About R – “designed for experienced R and RStudio users who want to (re)design their R lifestyle. We focus on building holistic and project-oriented workflows that address the most common sources of friction in data analysis, outside of doing the statistical analysis itself.”

Colin Fay, 2018-09-24, Why do we use arrow as an assignment operator? – answering the second-most-asked question about R.
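As a small illustration of the question that post addresses, the sketch below uses only base R. It shows the arrow and the equals sign both assigning at the top level, and the one common place where they behave differently: inside a function call. The variable names are made up for this example.

```r
# Both forms assign at the top level.
x <- 5
y = 5
same_value <- identical(x, y)

# The right-pointing arrow assigns in the other direction.
10 -> z

# Inside a call, `=` names an argument and leaves the top-level x alone.
m1 <- mean(x = c(1, 2, 3))
x_after_m1 <- x

# Inside a call, `<-` assigns in the calling environment and then
# passes the value along, so the top-level x is overwritten.
m2 <- mean(x <- c(1, 2, 3))
x_after_m2 <- x
```

After running this, x_after_m1 still holds the original value while x_after_m2 holds the vector that was passed to mean(), which is one practical reason to keep the arrow for assignment and the equals sign for arguments.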

3.3.1 Journals

The R Journal – "the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R."

Journal of Statistical Software – not limited to R, the journal “publishes articles, book reviews, code snippets, and software reviews on the subject of statistical software and algorithms.”


3.4 R Release Names

The most-asked question about R?

Answer: “All of the release names are references to Peanuts strips/films.”

Lucy D’Agostino McGowan, 2017-07-28, R release names

Releases since 2017-07-28:

(Another source for the strips is the peanuts.fandom.com comics archive)


3.5 R Introductions

Nathaniel D. Phillips, YaRrr! The Pirate’s Guide to R


3.6 The R toolbox

3.6.1 CRAN

The Comprehensive R Archive Network – CRAN for short – “is a network of ftp and web servers around the world that store identical, up-to-date, versions of code and documentation for R.” This is the place to get your installation of R, and the production versions of your favourite packages.
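A minimal sketch of getting a package from CRAN inside an R session follows. The helper install_if_missing() is a hypothetical name introduced for this example, not part of R; the mirror URL is the CRAN-hosted redirector, and any other mirror would work the same way.

```r
# Use the CRAN-hosted mirror redirector for this session.
options(repos = c(CRAN = "https://cloud.r-project.org"))

# Hypothetical helper: install a package from CRAN only when it is not
# already available, then report whether it can be loaded.
install_if_missing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# {stats} ships with R, so this call returns without downloading anything.
has_stats <- install_if_missing("stats")
```

Calling the helper with the name of a package that is not yet installed would download the current production version from the configured mirror.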

3.6.2 RStudio – the IDE

“RStudio” can mean one of two things:

3.6.3 Packages

Antoine Bichat’s favoriteRpackages


3.7 R as a programming environment

Colin Gillespie and Robin Lovelace, Efficient R Programming (Gillespie and Lovelace 2017)

Hadley Wickham, Advanced R (Wickham 2015a)

Hadley Wickham, R Packages (Wickham 2015b)

3.7.1 Debugging R

Hadley Wickham, “Debugging”, Chapter 22 in Advanced R (Wickham 2015a)

Jonathan McPherson, 2019-05-20, Debugging with RStudio

Jennifer Bryan and Jim Hester, “Debugging R code”, Chapter 11 in What They Forgot to Teach You About R

Jenny Bryan, 2020-01-30, Object of type ‘closure’ is not subsettable, talk at rstudio::conf 2020
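The error named in that talk title is easy to reproduce, and doing so is a useful first debugging exercise. The sketch below uses only base R and is an illustration rather than material from any of the resources above; get_rows() is a hypothetical function written for this example.

```r
# `mean` is a function (a closure), so indexing it like a vector fails.
# tryCatch() captures the error message instead of stopping the script.
msg_direct <- tryCatch(
  mean[1],
  error = function(e) conditionMessage(e)
)

# A common way to hit the same error: `df` was meant to be a data frame
# but was never created, so the name still refers to the function
# stats::df().
get_rows <- function(d) d[1, ]
msg_indirect <- tryCatch(
  get_rows(df),
  error = function(e) conditionMessage(e)
)
```

In an interactive session, calling traceback() right after an uncaught error shows the call stack that led to it, and setting options(error = recover) opens a debugger in the failing frame.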

3.7.2 R as part of the Data Science Toolbox

Jessica Minnier, 2019-07-29, Sharpening the Tools in Your Data Science Toolbox

3.7.3 The R-Python interface

3.7.3.1 {feather}

{feather} is designed to read and write feather files, a lightweight binary columnar data store designed for maximum speed. There is a parallel Python package.
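A minimal round trip with {feather} might look like the sketch below. It assumes the package is installed and does nothing if it is not; the data frame and its column names are made up for illustration.

```r
# Only run the round trip if {feather} is installed.
roundtrip <- NULL
if (requireNamespace("feather", quietly = TRUE)) {
  scores <- data.frame(id = 1:3, score = c(0.5, 0.7, 0.9))  # made-up data
  path <- tempfile(fileext = ".feather")

  feather::write_feather(scores, path)      # write the data frame to disk
  roundtrip <- feather::read_feather(path)  # read it back as a tibble

  # The same file could then be opened from Python, for example with
  # pandas.read_feather(path).
}
```

Because the file format is shared, the file written here is the hand-off point between an R step and a Python step of the same analysis.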

3.7.3.2 {reticulate}

The reticulate package provides a comprehensive set of tools for interoperability between Python and R.
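As a hedged sketch of what that interoperability looks like in practice, the code below calls a Python standard-library function from R and converts a vector in both directions. It assumes {reticulate} is installed and can find a Python installation, and it skips the Python calls otherwise.

```r
root <- NULL
r_values <- NULL
if (requireNamespace("reticulate", quietly = TRUE) &&
    reticulate::py_available(initialize = TRUE)) {
  # Import a Python standard-library module and call it from R.
  py_math <- reticulate::import("math")
  root <- py_math$sqrt(16)

  # Convert an R vector to a Python object and back again.
  py_values <- reticulate::r_to_py(c(1, 2, 3))
  r_values <- reticulate::py_to_r(py_values)
}
```

Guarding the calls this way keeps a script usable on machines without Python, at the cost of silently skipping the Python step.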

3.7.3.3 {rpy2}

JD Long, Twitter thread on {rpy2} documentation

-30-

References

Gillespie, Colin, and Robin Lovelace. 2017. Efficient R Programming: A Practical Guide to Smarter Programming. O’Reilly. https://csgillespie.github.io/efficientR/.

Wickham, Hadley. 2015a. Advanced R. CRC Press. https://adv-r.hadley.nz/.

Wickham, Hadley. 2015b. R Packages: Organize, Test, Document, and Share Your Code. O’Reilly. http://r-pkgs.had.co.nz/.