Chapter 19 Gaining more familiarity with R

19.1 Variables and values

Variables are created by assigning a value to a name. Variable names are case-sensitive and can include letters, numbers, “_” and “.”, but cannot begin with a number or “_”. Below we create two variables, “x” and “z”. Let “x” denote the number of chocolate chips that a bakery claims are in a purchased biscuit, and “z” the number of chocolate chips actually found in the biscuit.

x <- 8                 # x is assigned the value 8
x                      # Print the value of x, the number of chocolate chips advertised
## [1] 8

z <- 5                 # z is assigned the value 5
z                      # Print the value of z, the number of chocolate chips actually found in the biscuit
## [1] 5
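
The naming rules above can be checked directly in R. The following is a minimal sketch: the variable names chips and Chips and the helper is_valid_name are made up for illustration and are not part of the biscuit example.

```r
# Names are case-sensitive: chips and Chips are two different variables.
chips <- 8
Chips <- 5
chips == Chips                     # FALSE, because 8 is not equal to 5

# Hypothetical helper: treat a string as a valid name if
# make.names() leaves it unchanged.
is_valid_name <- function(name) make.names(name) == name

is_valid_name("chip_count")        # TRUE
is_valid_name("chip.count")        # TRUE
is_valid_name("2chips")            # FALSE: begins with a number
```

The helper relies on make.names(), which rewrites any string that is not a syntactically valid name, so an unchanged result indicates the name was already valid.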

19.1.1 “Assignment” creates a copy of a value.

y <- x                 # y is assigned the value of x
y <- 0                 # y becomes 0, x is not changed by this
y <- x + 1             # y becomes x+1, x is not changed by this
x <- x + 1             # x becomes x+1, its previous value is lost
                       # the number of chocolate chips advertised is now 9!
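
The same copy behaviour holds for objects that contain several values. A small sketch, assuming the illustrative names advertised and found (neither appears in the example above); c() builds a vector, which is covered in the next section.

```r
advertised <- c(8, 8, 8)     # chips advertised for three biscuits
found <- advertised          # found starts out as a copy of advertised
found[1] <- 5                # changing the copy...
advertised                   # ...leaves the original as 8 8 8
found                        # 5 8 8
```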

19.1.2 Vectors

A numeric vector is an ordered collection of numbers (see “biscuitTin” and “y” below). Here are some examples of how we can manipulate vectors.

biscuitTin <- c(1, 3.14, 0, -3, 7)  # Function c combines elements into a vector.
# Let's imagine that each element in biscuitTin is the number of chocolate chips in each of five biscuits.
y <- 1:5                       # : makes a numeric sequence in integer steps, 1,2,3,4,5
biscuitTin^2                   # Square each element in biscuitTin
## [1]  1.0000  9.8596  0.0000  9.0000 49.0000
biscuitTin + 1                 # Add 1 to each element in biscuitTin. (The 1 is recycled)
## [1]  2.00  4.14  1.00 -2.00  8.00
biscuitTin + y                 # Add corresponding pairs of elements
## [1]  2.00  5.14  3.00  1.00 12.00
biscuitTin[1]                # Get the first element of biscuitTin
## [1] 1
biscuitTin[1:3]              # Get the first three elements in biscuitTin
## [1] 1.00 3.14 0.00
# Negative indices select all elements except those at the given positions.
biscuitTin[-1]               # Get all except the first element.  Think of it as removing the first element.
## [1]  3.14  0.00 -3.00  7.00
biscuitTin[-(1:3)]           # Get all except the first three elements
## [1] -3  7
# Indexing and assignment.
biscuitTin[1:3] <- c(7,8,9)   # Set the values for first three elements
biscuitTin[1:3] <- 0          # Set the first three elements to 0 (recycled)
biscuitTin[-c(1:3)] <- 1      # Set all except the first three elements to 1

biscuitTin[biscuitTin > 0]           # Get elements where biscuitTin>0 is TRUE
## [1] 1 1
biscuitTin <- biscuitTin[biscuitTin > 0]       # Drop elements where biscuitTin<=0
biscuitTin[biscuitTin < 5] <- 0      # Set elements to 0 (recycled) where biscuitTin<5 is TRUE
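
Two ideas from the examples above, recycling and indexing by a condition, can be seen step by step in the following sketch. The vector chips and its values are hypothetical and chosen only to keep the arithmetic easy to follow.

```r
chips <- c(4, 9, 2, 7)            # hypothetical chip counts for four biscuits

# Recycling: the shorter vector is repeated to match the longer one.
chips + c(10, 20)                 # 14 29 12 27

# A comparison returns a logical vector, one TRUE/FALSE per element.
many <- chips > 5                 # FALSE TRUE FALSE TRUE

# Indexing with that logical vector keeps the TRUE positions.
chips[many]                       # 9 7
which(many)                       # positions of the TRUE values: 2 4
```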

19.2 Some more summary functions

Below are some summary functions, applied to biscuitTin as an example.

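length(biscuitTin)     # Number of elements
min(biscuitTin)        # Smallest value
max(biscuitTin)        # Largest value
sum(biscuitTin)        # Sum of all elements
mean(biscuitTin)       # Arithmetic mean
sd(biscuitTin)         # Sample standard deviation
var(biscuitTin)        # Sample variance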
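
To see how these summary functions relate to one another, the sketch below recomputes the mean, variance and standard deviation by hand and compares them with the built-in results. The vector chips and the names manual_mean and manual_var are assumptions made for this illustration.

```r
chips <- c(4, 9, 2, 7)                   # hypothetical chip counts

n <- length(chips)                       # 4
manual_mean <- sum(chips) / n            # 5.5
manual_var <- sum((chips - manual_mean)^2) / (n - 1)

all.equal(manual_mean, mean(chips))      # TRUE
all.equal(manual_var, var(chips))        # TRUE
all.equal(sqrt(manual_var), sd(chips))   # TRUE
```

Note that var() and sd() divide by n - 1 rather than n, which is why the manual calculation does the same.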

19.3 Dataframes in R

R includes several built-in data sets, one of which is named iris. iris is a “data frame” with 150 cases (rows) and 5 variables (columns). Note that the columns of a data frame are vectors or factors. Below, we have written some code to help explore the iris dataset. Try these out and see if you can determine what each function does.

data()         # List the available data sets
help(iris)     # What are the details on this dataset?
class(iris)    # What class of object is this?
dim(iris)      # Dimensions (number of rows and columns)
names(iris)    # Names of the variables (columns)
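
Because each column of a data frame is a vector or a factor, the vector tools from earlier in this chapter apply to columns as well. A brief sketch of one possible way to explore this, where the variable name sepal is made up for illustration:

```r
str(iris)                        # Compact overview of every column
head(iris, 3)                    # First three rows

sepal <- iris$Sepal.Length       # $ extracts one column as a vector
class(sepal)                     # "numeric"
class(iris$Species)              # "factor"
length(sepal)                    # 150, one value per row

mean(sepal)                      # Summary functions work on columns
iris[1:3, "Petal.Width"]         # Rows 1 to 3 of a single column
```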

19.4 Tidyverse

19.5 Data visualization

In this section we will provide some R recipes for creating different types of graphs. Here are some examples of the plots that we will help you create. We will include the R code in the coming days. Let us know if there are any other examples you’d like to see.