10 R Documentation

This chapter discusses how to navigate R documentation so that we can better understand R's functions and syntax.

10.1 Add-on packages

In R, add-on packages are packages that are not included in the default installation of R (see footnote 1), but can be installed separately to extend the functionality of the language. These packages are typically created by third-party developers, and are hosted on the Comprehensive R Archive Network (CRAN) or other repositories.

For instance, tidyquant is a tidyverse-style package that helps us collect and analyze financial data in R. You’ll also find people using the R package quantmod for the same array of tasks. Neither tidyquant nor quantmod comes with base R. They are add-on packages that R users have developed to accomplish very specific tasks.
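The distinction between base and add-on packages can be checked from the console. The sketch below is a minimal illustration using only functions that ship with R; the helper name is_installed is made up for this example, and the install.packages() call is left commented out because it needs a network connection.

```r
# Packages with priority "base" ship with every R installation
base_pkgs <- rownames(installed.packages(priority = "base"))
print(base_pkgs)

# Hypothetical helper: is a package available in the current library?
# requireNamespace() loads the namespace but does not attach the package.
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

for (pkg in c("stats", "tidyquant", "quantmod")) {
  cat(pkg, "installed:", is_installed(pkg), "\n")
}

# Installing and attaching an add-on package from CRAN (not run here):
# install.packages("tidyquant")
# library(tidyquant)
```

On a fresh installation, stats should be reported as installed, while tidyquant and quantmod are reported as installed only after install.packages() has been run for them.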

This raises several questions. How do tidyquant and quantmod differ from each other? Which one should I use? How would I know whether a package is a “good” one? What is tidyverse? Besides, once we have selected a package, navigating its documentation can be daunting at first. Below we offer some tips on navigating R documentation to find and use add-on packages.

10.3 Finding and using add-on packages

1. Which package should I use?

  1. Look at what members of your community use. You can also check how popular a package is in general on the RDocumentation website.

  2. Make sure the package is being actively maintained. One indicator is the last updated date. We can find this date on the first page of its reference manual, or check its GitHub repository.

For instance, at least three R packages can be used to collect data from Twitter: rtweet, twitteR and streamR. The last updated dates for the three packages are in 2023, 2022 and 2018, respectively. The last one, streamR, is not being actively maintained.

Whether a package is actively maintained matters a lot, especially because web services such as Twitter frequently update their policies for data collection via their APIs. The R wrappers for the Twitter APIs therefore need to keep up with Twitter’s pace.
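For a package that is already installed, the date recorded in its DESCRIPTION file can also be read from within R. The following is a minimal sketch: it uses utils::packageDate() on the utils package, which ships with R, so it runs without installing anything. The helper years_since is a made-up name for this example, and what counts as "too old" is a judgment call that the code does not settle.

```r
# Date recorded for an installed package (for base packages this is the
# build date; for CRAN packages it is usually the packaging or publication date)
last_update <- utils::packageDate("utils")
print(last_update)

# Hypothetical helper: years elapsed since a given date
years_since <- function(date, today = Sys.Date()) {
  as.numeric(difftime(today, as.Date(date), units = "days")) / 365.25
}

print(years_since(last_update))

# The same call works for an add-on package once it is installed, e.g.
# utils::packageDate("rtweet")
```

This only complements the checks above: the reference manual and the GitHub repository remain the places to look before a package is installed.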

2. Which function should I use?

Make sure you read the documentation of the function, pay attention to its technical details, and understand what it does, at least intuitively.
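As a rough illustration of where that documentation lives, the sketch below uses help facilities that come with base R. The calls that open a help page are commented out because they are meant for interactive sessions; stats::median and stats::sd are used only because they are available in every R installation.

```r
# Open the help page of a function (interactive sessions):
# ?median
# help("median", package = "stats")

# List all documented topics of a package (interactive sessions):
# help(package = "stats")

# Search the installed documentation by keyword (interactive sessions):
# help.search("median")

# Inspect a function's arguments without opening its help page
print(args(stats::median))

# Argument names as a character vector
sd_args <- names(formals(stats::sd))
print(sd_args)
```

For a function from an add-on package, the same calls apply once the package is installed, for example help(package = "sentimentr").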

For instance, there are several packages that specialize in text analysis. A technique often used in text analysis is sentiment analysis, and more specifically, a metric called polarity score. It is available in several packages, including sentimentr and quanteda, which are both popular among R users for textual analysis.

Then, which package and which function should we rely on? Before we can make a decision, we need to think about the equations and dictionaries that the polarity score functions in these two packages use. Do they use the same equations? What dictionaries are they built upon? Do they use the same dictionaries? Can they solve the problem at hand?
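To see why the equation matters, consider the toy sketch below. The dictionaries, the tokens and both formulas are made up for illustration; neither formula is claimed to be the one used by sentimentr or quanteda. The point is only that the same text and the same dictionary can yield different scores under different definitions.

```r
# Toy dictionaries, made up for illustration; real packages ship their own
positive <- c("good", "great", "gain")
negative <- c("bad", "loss", "risk")

# A tokenized toy sentence
tokens <- c("great", "gain", "but", "some", "risk")

n_pos <- sum(tokens %in% positive)  # number of positive matches
n_neg <- sum(tokens %in% negative)  # number of negative matches

# Two plausible but different definitions of a polarity score (assumptions)
score_by_length  <- (n_pos - n_neg) / length(tokens)   # scaled by all tokens
score_by_matches <- (n_pos - n_neg) / (n_pos + n_neg)  # scaled by matched tokens

print(c(by_length = score_by_length, by_matches = score_by_matches))
```

Swapping in a different dictionary would change n_pos and n_neg as well, which is why both the equation and the dictionary need to be checked in the documentation.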

The functions that generate polarity scores are documented in the help files of quanteda and sentimentr.

3. Package-specific object classes

When calling functions from an add-on package, we often get back objects of classes specific to that package. For instance, quantmod returns xts or zoo objects when retrieving Yahoo Finance data, whereas tidyquant returns data in “tidy” form, as tbl_df and tbl objects. However, underneath these peculiar names, these objects are ultimately R data structures. We can access and manipulate them with the methods that we have learnt before in the section “data structures”.
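The idea can be sketched without installing any add-on package. Below, a plain data frame is given an extra class on top, similar in spirit to how a tibble carries tbl_df and tbl in addition to data.frame. The class name my_tbl and the values are made up for illustration.

```r
# A small data frame with made-up values
prices <- data.frame(
  date  = as.Date("2024-01-02") + 0:2,
  close = c(10.0, 10.5, 10.2)
)

# Add a hypothetical package-specific class on top of "data.frame"
class(prices) <- c("my_tbl", class(prices))

print(class(prices))                   # full class vector
print(inherits(prices, "data.frame"))  # still a data frame underneath
print(is.list(prices))                 # and a data frame is a list of columns

# The usual base R tools keep working
print(prices$close[2])
above <- prices[prices$close > 10.1, ]
print(above)
```

With a real package object, class(), inherits() and str() are usually the first calls to try in order to see which familiar structure sits underneath.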

4. What is tidyverse?

If you use R, you will probably bump into tidyverse. Even if you don’t use it, or dislike it for some reason, you may have collaborators who do.

tidyverse packages operate on “tidy data”. Tidy data has a specific structure: each variable is a column; each observation is a row; and each type of observational unit is a table. That is straightforward.
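The three rules are easier to see on a small table. The sketch below uses base R only and made-up values for two hypothetical tickers, AAA and BBB. In the wide layout, the ticker is hidden in the column names; in the tidy layout, it becomes a variable of its own.

```r
# Wide layout: one column per ticker (made-up values)
wide <- data.frame(
  date = as.Date(c("2024-01-02", "2024-01-03")),
  AAA  = c(10.0, 10.5),
  BBB  = c(20.0, 19.8)
)

# Tidy layout: one row per date-ticker observation,
# one column per variable (date, ticker, close)
tidy <- data.frame(
  date   = rep(wide$date, times = 2),
  ticker = rep(c("AAA", "BBB"), each = nrow(wide)),
  close  = c(wide$AAA, wide$BBB)
)
print(tidy)

# With tidyr installed, the same reshaping is typically written as (not run):
# tidyr::pivot_longer(wide, -date, names_to = "ticker", values_to = "close")
```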

If you know the Marvel universe, the name tidyverse will make sense. tidyverse is a collection of R packages that share an underlying design philosophy, grammar, and data structures. It is designed for data science.

Alternatively, you can think of tidyverse as a dialect of R, and it is certainly not the only dialect in R.
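The dialect analogy can be made concrete by writing the same computation twice. The sketch below computes the mean of mpg by cyl in the built-in mtcars data set, first in base R and then in the tidyverse style. The second part assumes that dplyr is installed and is skipped otherwise.

```r
# Base R: mean miles per gallon by number of cylinders
base_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(base_result)

# tidyverse dialect: same computation, run only if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(dplyr))
  tidy_result <- mtcars %>%
    group_by(cyl) %>%
    summarise(mpg = mean(mpg))
  print(tidy_result)
}
```

Both versions give the same three group means; they differ in the vocabulary used and in the class of the returned object, a plain data frame in the first case and a tibble in the second.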


  1. For a complete list of base functions, use library(help = "base").