Chapter 3 Package

3.1 Introduction

Packages in R are collections of R functions, data, and compiled code bundled together. Each package targets a specific task or problem domain, ranging from data manipulation to machine learning and visualization.

Key features of R packages:

  • Provide reusable functions.

  • Include documentation and examples.

  • Enable sharing of community-developed tools.

Commonly used packages include:

  • ggplot2: Advanced data visualization.

  • dplyr: Data manipulation.

  • caret: Machine learning.

  • shiny: Interactive web applications.

3.2 R Package Repositories

R packages are typically hosted on repositories, the most notable being:

  • CRAN (Comprehensive R Archive Network): The primary and most trusted source of R packages.

  • Bioconductor: Specialized in bioinformatics and computational biology packages.

  • GitHub: A popular platform for sharing development versions of R packages.

3.2.1 Choosing the Right Repository

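For most use cases, CRAN is the default choice: packages must pass automated checks and follow CRAN's submission policies before they are accepted, and they are required to ship documentation. For cutting-edge or experimental packages, GitHub is a good option, with the caveat that development versions may be less stable.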

3.3 Installing Packages from CRAN

The most straightforward way to install packages is by using the install.packages() function.

3.3.1 Basic Syntax

install.packages("package_name")

3.3.2 Example

To install the ggplot2 package:

install.packages("ggplot2")

3.3.3 Installing Multiple Packages

You can install multiple packages at once by providing a vector of package names:

install.packages(c("dplyr", "tidyr", "stringr"))
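
To avoid reinstalling packages that are already present, one option is to check what is missing first. The following is a minimal sketch rather than a standard idiom; the helper name missing_packages is hypothetical, and the install call is left commented out so the sketch has no side effects.

```r
# Hypothetical helper: return the subset of `pkgs` that is not yet installed.
missing_packages <- function(pkgs) {
  installed <- rownames(installed.packages())
  setdiff(pkgs, installed)
}

wanted <- c("dplyr", "tidyr", "stringr")
to_install <- missing_packages(wanted)

# Install only what is missing (requires an internet connection).
# if (length(to_install) > 0) install.packages(to_install)
```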

3.3.4 Choosing a CRAN Mirror

When prompted, select a CRAN mirror close to your geographic location for faster downloads. Alternatively, specify it in the repos argument:

install.packages("ggplot2", repos = "https://cloud.r-project.org/")
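
Instead of passing repos on every call, the mirror can be set once per session through the repos option. A short sketch, using the same mirror URL as above:

```r
# Set the default CRAN mirror for the current session.
options(repos = c(CRAN = "https://cloud.r-project.org/"))

# Inspect the repositories that install.packages() will use by default.
getOption("repos")
```

Placing the options() line in your .Rprofile file makes the setting apply to every new session.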

3.4 Installing Packages from Bioconductor

Bioconductor packages are installed with the BiocManager package, which is itself available from CRAN.

3.4.1 Steps:

  1. Install BiocManager:

    install.packages("BiocManager")
  2. Use BiocManager to install Bioconductor packages:

    BiocManager::install("GenomicFeatures")
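
The two steps above can be combined so that BiocManager is installed only when it is missing. This is a sketch under that assumption; the wrapper name install_bioc is hypothetical, and the example call is commented out because it needs an internet connection.

```r
# Hypothetical wrapper: install BiocManager only if needed, then use it
# to install the requested Bioconductor packages.
install_bioc <- function(pkgs) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkgs)
}

# Example call:
# install_bioc("GenomicFeatures")
```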

3.5 Installing Packages from GitHub

GitHub hosts many experimental and development versions of R packages. Use the devtools package to install packages from GitHub.

3.5.1 Steps:

  1. Install the devtools package:

    install.packages("devtools")
  2. Install a package from GitHub:

    devtools::install_github("username/repository")

3.5.2 Example

To install the development version of ggplot2, which is hosted under the tidyverse organization on GitHub:

devtools::install_github("tidyverse/ggplot2")
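
install_github() also accepts a branch, tag, or commit appended to the repository name as "username/repository@ref". The sketch below only builds such a string; the helper github_spec is hypothetical, and the branch name "main" is an assumption used for illustration.

```r
# Hypothetical helper: build a "username/repository" or
# "username/repository@ref" string for install_github().
github_spec <- function(user, repo, ref = NULL) {
  spec <- paste(user, repo, sep = "/")
  if (!is.null(ref)) {
    spec <- paste0(spec, "@", ref)
  }
  spec
}

github_spec("tidyverse", "ggplot2")
github_spec("tidyverse", "ggplot2", ref = "main")  # "main" is an assumed branch name

# Install from that reference (requires devtools and an internet connection):
# devtools::install_github(github_spec("tidyverse", "ggplot2", ref = "main"))
```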

3.6 Installing Packages from Archive

In some cases, you may need to install an older version of a package, either due to compatibility issues or specific project requirements. These versions are available in the CRAN package archive.

3.6.1 Accessing the Archive

CRAN maintains a package archive where you can find older versions of packages. The archive is accessible at:
https://cran.r-project.org/src/contrib/Archive/

Each package’s subdirectory contains previous versions in .tar.gz format.
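
As an illustration of that layout, the sketch below builds the download URL for a given package and version. It assumes the archive follows the pattern Archive/<package>/<package>_<version>.tar.gz, which matches the file name dplyr_1.0.5.tar.gz used later in this chapter; the helper name archive_url is hypothetical.

```r
# Hypothetical helper: build the URL of an archived source package,
# assuming the layout Archive/<package>/<package>_<version>.tar.gz.
archive_url <- function(pkg, version) {
  paste0(
    "https://cran.r-project.org/src/contrib/Archive/",
    pkg, "/", pkg, "_", version, ".tar.gz"
  )
}

archive_url("dplyr", "1.0.5")
```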

3.6.2 Installing from the Archive

3.6.2.1 Step 1: Download the Package Source

  1. Visit the package’s archive page on CRAN.
  2. Download the desired version as a .tar.gz file.

3.6.2.2 Step 2: Install the Downloaded Package

Use the install.packages() function to install the package locally from the downloaded file.

3.6.2.2.1 Syntax:

install.packages("path/to/package.tar.gz", repos = NULL, type = "source")

3.6.2.2.2 Example:

If the file dplyr_1.0.5.tar.gz is saved in your Downloads folder:

install.packages("~/Downloads/dplyr_1.0.5.tar.gz", repos = NULL, type = "source")

3.6.2.3 Specifying a Version (Alternative Method)

With the devtools package, you can directly install a specific version of a package from the CRAN archive:

devtools::install_version("dplyr", version = "1.0.5", repos = "https://cloud.r-project.org/")
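
Whichever method you use, it is worth confirming that the expected version is the one actually installed. A minimal sketch, with a hypothetical helper named has_version; it is demonstrated on the stats package because that package ships with R.

```r
# Hypothetical helper: TRUE if `pkg` is installed at exactly `version`.
has_version <- function(pkg, version) {
  # system.file() returns "" when the package is not installed.
  nzchar(system.file(package = pkg)) && packageVersion(pkg) == version
}

# After installing dplyr 1.0.5 you could run:
# has_version("dplyr", "1.0.5")

# Demonstration on a package that is always available:
as.character(packageVersion("stats"))
```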

3.7 Managing Installed Packages

Once packages are installed, you need to load them before use, and you may occasionally want to list, update, or remove them.

3.7.1 Loading Installed Packages

Use library() to load a package into the current session:

library(ggplot2)
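
Attaching a package with library() is not the only way to reach its functions. The sketch below contrasts the options; it uses the tools package only because it ships with R, so the example runs without installing anything.

```r
# Attach a package: its exported functions become available by name.
library(tools)
"package:tools" %in% search()

# Call a single function without attaching its package, using `::`.
tools::file_ext("report.csv")

# requireNamespace() returns TRUE or FALSE instead of raising an error,
# which makes it convenient for checking availability in scripts.
requireNamespace("tools", quietly = TRUE)
```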

3.7.2 Checking Installed Packages

List all installed packages:

installed.packages()
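
installed.packages() returns a character matrix with many columns, which can be hard to read on screen. One possible way to get a compact overview is sketched below; the variable names inst and pkg_table are illustrative.

```r
# Full matrix: one row per installed package.
inst <- installed.packages()

# Keep only the package name and version in a plain data frame.
pkg_table <- data.frame(
  Package = unname(inst[, "Package"]),
  Version = unname(inst[, "Version"]),
  stringsAsFactors = FALSE
)

head(pkg_table)

# Check whether a particular package is installed.
"stats" %in% pkg_table$Package
```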

3.7.3 Updating Packages

Update all packages:

update.packages()

Update a specific package by reinstalling it, which fetches the latest version available from the repository:

install.packages("package_name")
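
Before updating, you may want to see which packages are out of date. The sketch below wraps old.packages() for that purpose; the helper name outdated is hypothetical, and the calls are commented out because they query the repository over the network.

```r
# Hypothetical helper: names of installed packages for which the
# repository offers a newer version.
outdated <- function(repos = "https://cloud.r-project.org/") {
  old <- old.packages(repos = repos)
  # old.packages() returns NULL when everything is up to date.
  if (is.null(old)) character(0) else rownames(old)
}

# List outdated packages:
# outdated()

# Update all packages without being asked to confirm each one:
# update.packages(ask = FALSE)
```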

3.7.4 Removing Packages

Remove an installed package:

remove.packages("package_name")

3.8 Troubleshooting Installation Issues

Package installation can sometimes fail. Here are common issues and solutions:

3.8.1 Missing Dependencies

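Packages often rely on other packages, called dependencies. When you install from a repository, install.packages() installs the required dependencies automatically. When you install from a local file with repos = NULL, dependencies are not resolved, so install them beforehand.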
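
To see what an installed package depends on, you can read its description. A small sketch follows, using the stats package because it ships with R; the commented line shows how to also request optional (suggested) packages during installation.

```r
# Read the metadata of an installed package.
desc <- packageDescription("stats")

desc$Package
desc$Imports   # packages that this package imports

# Install a package together with its suggested packages
# (requires an internet connection):
# install.packages("ggplot2", dependencies = TRUE)
```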

3.8.2 Internet Connectivity Issues

Ensure your internet connection is stable. For corporate environments, check firewall settings.
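
On slow connections, large downloads can fail because they exceed R's download timeout, which is controlled by the timeout option (in seconds). The value 300 below is an arbitrary choice for illustration.

```r
# Remember the current setting so it can be restored later.
old_timeout <- getOption("timeout")

# Allow up to 300 seconds per download for this session.
options(timeout = 300)
getOption("timeout")

# Restore the previous value when done:
# options(timeout = old_timeout)
```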

3.8.3 Outdated R Version

If a package requires a newer version of R than the one you have installed, update R to the latest version.
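
You can check which version of R you are running and compare it against a package's stated requirement. In this sketch the helper name r_at_least is hypothetical, and "4.1.0" is only an example requirement.

```r
# Human-readable version of the running R installation.
R.version.string

# Hypothetical helper: TRUE if the running R is at least `version`.
r_at_least <- function(version) {
  getRversion() >= version
}

r_at_least("4.1.0")  # example requirement; substitute the package's own
```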

3.8.4 Compatibility with Operating System

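Some packages need system libraries or a compiler toolchain, and these differ between Windows, macOS, and Linux. Check the package’s documentation for OS-specific instructions.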
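
When reading OS-specific instructions or reporting a problem, it helps to know how R identifies your platform. The following sketch prints that information using base R only.

```r
# Operating system name, for example "Linux", "Windows", or "Darwin".
Sys.info()[["sysname"]]

# Broad platform family: "unix" or "windows".
.Platform$OS.type

# Default package type used by install.packages(): "source" means
# packages are built locally, other values indicate precompiled binaries.
getOption("pkgType")
```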

3.9 Summary

Package installation in R is a vital skill that enables you to leverage the full power of the R ecosystem. By understanding repositories, installation methods, and troubleshooting techniques, you can streamline your workflow and focus on analysis and development.

In the next chapter, we will explore how to create your own R package, empowering you to contribute to the R community.