Chapter 2 R Scripts and R Packages

2.1 Objectives

In this chapter, we would like users

  • to write simple R scripts
  • to understand R packages
  • to install R packages
  • to create a new RStudio project
  • to be able to use RStudio Cloud

2.2 Introduction

An R script is simply a text file containing (almost) the same commands that you would enter on the command line of R. The word "almost" refers to the fact that if you are using sink() to send the output to a file, you will have to enclose some commands in print() to get the same output as on the command line.
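A related situation where print() matters can be sketched as follows. This is our own illustration, assuming the script is run with source() and its default settings; the temporary file name is generated only for the example. Under those defaults, a bare expression in the script is not printed automatically, while an explicit print() call is.

```r
# Write a two-line script to a temporary file (hypothetical example script)
script_path <- tempfile(fileext = ".R")
writeLines(c("2 + 3", "print(2 + 3)"), script_path)

# Run the script and capture whatever it prints
out <- capture.output(source(script_path))

# Only the line wrapped in print() produced output
cat(out, sep = "\n")
## [1] 5
```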

R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN (the Comprehensive R Archive Network).

2.3 Open a new R script

As a beginner, you may start by writing some simple code. Since this code is written in the R language and saved in a file, we call the file an R script. To open a new R script in RStudio, go to File, then New File, then click R Script

  • File -> New File -> R Script
  • On Windows, users can use the shortcut CTRL-SHIFT-N (CMD-SHIFT-N on macOS)

New R Script

2.3.1 Our first R script

Let us write our very first R code inside an R script.

  • In Line 1, type 2 + 3
  • press CTRL-ENTER (Windows) or CMD-ENTER (macOS)
  • see the output in the Console pane

2 + 3
## [1] 5

After writing your code inside the R script, you can save the R script file. This will allow you to open it again later and continue your work.

To save the R script, go to

  • File ->
  • Save As ->
  • Choose folder ->
  • Name the file

Now, type this code to check the version of your R software

version[6:7]
##        _
## status  
## major  4

At the time of writing, the current version of R is 4.2.1
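If you prefer a single line of output, the following sketch shows two other ways to inspect the version. Comparing against "4.2.1" simply reuses the version number quoted above; substitute whichever version you need.

```r
# Full version description as one string
R.version.string

# Version number as an object that can be compared
getRversion()

# TRUE if the installed R is at least version 4.2.1
is_recent <- getRversion() >= "4.2.1"
is_recent
```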

By the way, if you are using an older version of R, we recommend that you upgrade. To upgrade your R software

  • if you are using Windows, you can use the installr package
  • if you are using macOS, you may need to download R again and install it manually

You may find more information from this link.

2.3.2 Function, Argument and Parameters

R code contains

  • functions
  • arguments
  • parameters

f <- function(<arguments>) {
  ## Do something interesting
}
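To make the template above concrete, here is a minimal sketch of a user-defined function. The function name bmi and the argument names weight_kg and height_m are our own illustrative choices and are not part of base R.

```r
# A hypothetical function with two arguments (names chosen for illustration)
bmi <- function(weight_kg, height_m) {
  # Body mass index: weight in kilograms divided by height in metres squared
  weight_kg / height_m^2
}

# The values 70 and 1.75 are supplied to the arguments weight_kg and height_m
bmi_value <- bmi(weight_kg = 70, height_m = 1.75)
round(bmi_value, 1)
## [1] 22.9
```

Here weight_kg and height_m are the arguments, and 70 and 1.75 are the values passed to them when the function is called.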

For example, to list all the arguments of a function, you may use args(). Let’s examine the arguments of lm(), a function that estimates the parameters of a linear regression model.

args(lm)
## function (formula, data, subset, weights, na.action, method = "qr", 
##     model = TRUE, x = FALSE, y = FALSE, qr = TRUE, singular.ok = TRUE, 
##     contrasts = NULL, offset, ...) 
## NULL

Once you understand the required arguments, you can supply values (the parameters) to them so that the function performs the desired task. For example:

lm(weight ~ Time, data = ChickWeight)
## 
## Call:
## lm(formula = weight ~ Time, data = ChickWeight)
## 
## Coefficients:
## (Intercept)         Time  
##      27.467        8.803
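As a small sketch of how values are matched to arguments, the call above can be written with every argument named or with the values given in the order shown by args(lm). The object names fit_named and fit_positional are ours, chosen for illustration.

```r
# ChickWeight is a dataset that ships with R (datasets package)

# Arguments matched by name
fit_named <- lm(formula = weight ~ Time, data = ChickWeight)

# Arguments matched by position: formula first, data second
fit_positional <- lm(weight ~ Time, ChickWeight)

# Both calls estimate the same coefficients
identical(coef(fit_named), coef(fit_positional))
## [1] TRUE

# Storing the model in an object lets you extract results later
round(coef(fit_named), 3)
```

Naming the arguments is usually easier to read, especially for functions with many optional arguments such as lm().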

2.3.3 If users require further help

If users would like to see a more extensive guide to a certain function, they may type ? before the function name. For example, if users want to know more about the function lm, they may type the R code below. R will then open a help page with a more detailed description, the usage of the function and the relevant arguments.

?lm
## starting httpd help server ... done
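A few related ways of finding help can be sketched as follows. These are all base R functions; the search pattern "^lm" and the object name lm_matches are only examples.

```r
# help() is the function form of the ? shortcut
help("lm")

# List objects whose names start with "lm"
lm_matches <- apropos("^lm")
lm_matches

# Search the help system by keyword when the function name is unknown
# (equivalent to typing ??"linear model"); left commented out because
# the search can take a moment
# help.search("linear model")
```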

Here is an example of what the Help pane looks like.

Help Pane

2.4 Packages

R is a programming language, and much of its functionality comes from packages. R packages are collections of functions and data sets developed by the community. They increase the power of R by improving existing base R functions or by adding new ones.

A package is a convenient way to organize your work and, if you wish, to share it with others. Typically, a package will include

  • code (sometimes not just R code but also code in other programming languages),
  • documentation for the package and the functions inside,
  • some tests to check that everything works as it should, and
  • data sets.

Users can read more about R packages here.

2.4.1 Packages on CRAN

At the time of writing, the CRAN package repository features 12784 packages. Packages grouped by topic are listed on the CRAN Task Views website.

CRAN Task Views

CRAN task views aim to provide some guidance on which packages on CRAN are relevant for tasks related to a certain topic. They give a brief overview of the included packages, which can be installed automatically using the ctv package.

The views are intended to have a sharp focus so that it is sufficiently clear which packages should be included (or excluded) and they are not meant to endorse the “best” packages for a given task.

2.4.2 Checking availability of R package

To check if the desired package is available on their machine, users can run this in their R console:

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.4.0      ✔ purrr   0.3.5 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.3      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

If the package is installed, users should not receive any error messages. Users who have not installed the package will receive an error message telling them that the package is not available in their R library. By default, packages are stored in the R library folder in the user directory (for example My Documents or HOME); the exact locations can be listed with .libPaths():

.libPaths()
## [1] "C:/Users/drkim/AppData/Local/R/win-library/4.2"
## [2] "C:/Program Files/R/R-4.2.2/library"
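Returning to the availability check, a package can also be looked up without loading it. The helper below is a minimal sketch; the name is_installed is our own and is not a base R function. It relies on system.file(), which returns an empty string when the package is not found in the library paths.

```r
# Hypothetical helper: TRUE if the package is found in .libPaths()
is_installed <- function(pkg) {
  # system.file() returns "" when the package cannot be found
  nzchar(system.file(package = pkg))
}

# stats is part of every standard R installation
is_installed("stats")
## [1] TRUE

# A made-up package name, used only to show the negative case
is_installed("aPackageThatDoesNotExist")
## [1] FALSE
```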

2.4.3 Install an R package

There are two ways to install an R package:

  1. users can type R code like the line below (without the # sign)

# install.packages("tidyverse", dependencies = TRUE)

  2. users can use the GUI in the RStudio IDE

Install Packages pane

Now, type the name of the package you want to install. For example, suppose you want to install the tidyverse package

The name of the package to be installed

Then click the Install button. You need internet access to do this. You can also install packages from:

  • a zip file (from your machine or a USB drive),
  • a GitHub repository,
  • other repositories
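The installation step can be combined with an availability check so that packages are installed only when they are missing. The function install_if_missing below is a hypothetical helper written for this sketch, not part of R or RStudio. The commented lines show the general form for the other sources listed above; the file path and repository name are placeholders, and install_github() comes from the remotes package, which must itself be installed first.

```r
# Hypothetical helper: install only the packages that are not yet available
install_if_missing <- function(pkgs) {
  # system.file() returns "" for packages that cannot be found
  found <- nzchar(vapply(pkgs, function(p) system.file(package = p), character(1)))
  to_install <- pkgs[!found]
  if (length(to_install) > 0) {
    install.packages(to_install, dependencies = TRUE)
  }
  # Return (invisibly) the names of the packages that needed installing
  invisible(to_install)
}

# stats and utils come with R, so nothing is downloaded here
newly_needed <- install_if_missing(c("stats", "utils"))
length(newly_needed)
## [1] 0

# Other sources (placeholders, not real paths or repositories):
# install.packages("path/to/package.zip", repos = NULL)
# remotes::install_github("owner/repository")
```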

2.5 Working directory

Setting and knowing the R working directory is very important. Our working directory will contain the R code, the R outputs, the datasets and even resources or tutorials that can help us during an R project or an R analysis.

The working directory is just a folder, and the folder can contain many sub-folders. We recommend that the folder contain the dataset (if you want to analyze your data locally) and other R objects. Objects that you save during an R session will also be stored there by default.

Type this to locate the working directory:

getwd()
## [1] "C:/Users/drkim/OneDrive - Universiti Sains Malaysia/multivar_data_analysis"

2.5.1 Starting a new R job

There are two ways to start a new R job:

  • create a new R project from the RStudio IDE. This is the method that we recommend.
  • set your working directory using the setwd() function.
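The second option can be sketched as follows. The folder name my_analysis is hypothetical, and it is created inside R's temporary directory only so that the example does not touch your own files; in practice you would point setwd() at your own analysis folder.

```r
# Remember the current working directory so it can be restored later
old_wd <- getwd()

# Create a hypothetical analysis folder inside the temporary directory
project_dir <- file.path(tempdir(), "my_analysis")
dir.create(project_dir, showWarnings = FALSE)

# Make it the working directory and confirm the change
setwd(project_dir)
basename(getwd())
## [1] "my_analysis"

# Files are now read from and written to this folder by default
list.files()

# Restore the previous working directory
setwd(old_wd)
```

With an RStudio project, this step is unnecessary because opening the project sets the working directory to the project folder.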

2.5.2 Creating a new R project

We strongly encourage users to create a new R project. To do this, users can

  • go to File -> New Project

New Project

When you are asked for the project type, click New Project

Project Type

2.5.3 Location for dataset

Many data analysts use data stored on their local machines. R will read the data and usually store it as a data frame. When you read your data into RStudio, you will see the dataset in the Environment pane. R reads the original dataset and keeps a copy in RAM (random access memory), so you should know how much RAM your machine has. The more RAM you have, the larger the dataset that R can read and hold in memory.

The data held in memory will disappear once you close RStudio. But the source dataset will stay in its original location, so there will be no change to your original data (be happy!) unless you save the data frame from memory and overwrite the original file. However, we do not recommend you do this.

Environment pane lists data in the memory
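The behaviour described in this section can be sketched with a small example. We assume a CSV file written to a temporary location from the built-in ChickWeight data; the object names csv_path, chick and reread are ours.

```r
# Write a built-in dataset to a temporary CSV file to act as the "source" data
csv_path <- tempfile(fileext = ".csv")
write.csv(as.data.frame(ChickWeight), csv_path, row.names = FALSE)

# Read the file; the data now lives in memory as a data frame
chick <- read.csv(csv_path)
class(chick)
## [1] "data.frame"

# Approximate amount of memory used by the data frame
format(object.size(chick), units = "Kb")

# Changing the data frame in memory does not change the file on disk
chick$weight <- chick$weight * 2
reread <- read.csv(csv_path)
all.equal(mean(chick$weight), 2 * mean(reread$weight))
## [1] TRUE
```

The file on disk would change only if the modified data frame were written back to the same path, for example with write.csv().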

2.6 Upload data to RStudio Cloud

If users want to use data in RStudio Cloud, they may have to upload the data to the RStudio Cloud directory. They may also use RStudio Cloud to read data from a Dropbox folder or a Google Drive folder.

Upload tab on RStudio Cloud

2.7 More resources on RStudio Cloud

There are a number of resources on RStudio Cloud. For example, on YouTube there is RStudio Cloud for Education https://www.youtube.com/watch?v=PviVimazpz8. Another good resource on YouTube is Working with R in Cloud https://www.youtube.com/watch?v=SFpzr21Pavg

2.8 Guidance and help

For further guidance and help, users may register and join the RStudio Community. Users can also ask questions on Stack Overflow. There are also mailing lists on specific topics, but users have to subscribe to them.

2.9 Bookdown

RStudio has provided a website to host online books, called Bookdown. The books at Bookdown are freely accessible online, and some of them are also available as physical books from Amazon or other booksellers.

Bookdown

2.10 Summary

In this chapter, we described R scripts and R packages. We also showed how to write simple R scripts, how to check whether a specific R package is available on your machine and how to install it if it is not. We recommend using RStudio Cloud if you are very new to R. The working directory sometimes confuses new R users, so we also recommend that all R users create a new RStudio project for each new analysis task. There are resources available offline and online, and many of them are freely accessible, especially at the Bookdown website.