# 4 Fundamentals

This chapter covers the basics of R and RStudio.

## 4.1 Syntax

1. ordered list
2. item 2
   - sub-item 1

## 4.2 Vectors

A vector is an ordered collection of elements that all share the same mode (numeric, character, or logical):

v.n <- c(3,4,5,6, NA)
v.c <- c("Tom","Jim","Tim")
v.l <- c(TRUE,TRUE,FALSE)
#Missing value is coded as NA
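
Because every element of a vector shares one mode, mixing types inside c makes R coerce all elements to the most general type present. The lines below are a minimal sketch of that behaviour; the names v.mixed and v.numlog are made up for illustration.

```r
# Mixing numbers, strings and logicals: everything becomes character
v.mixed <- c(1, "Tom", TRUE)
mode(v.mixed)    # "character"

# Mixing numbers and logicals: TRUE/FALSE become 1/0
v.numlog <- c(3, TRUE, FALSE)
mode(v.numlog)   # "numeric"

# NA takes on the mode of the vector that contains it
v.n <- c(3, 4, 5, 6, NA)
mode(v.n)        # "numeric"
is.na(v.n)       # FALSE FALSE FALSE FALSE  TRUE
```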

We can create a vector with the c function (concatenation) or with the functions seq and rep:

v.n1 <- rep(2, 4)
v.n2 <- rep(v.n, 4)
v.n3 <- rep(v.n, each=4)
v.n4 <- seq(from=3, to=10, length.out=10)
v.n5 <- seq(from=3, to=10, by=0.5)
v.n6 <- 1:10
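
To see how the arguments of rep and seq differ, it can help to compare the lengths and the first few elements of the results. This is only an illustrative sketch that reuses v.n from above.

```r
v.n <- c(3, 4, 5, 6, NA)

length(rep(v.n, 4))          # 20: the whole vector repeated 4 times
length(rep(v.n, each = 4))   # 20: each element repeated 4 times in a row
rep(v.n, 4)[1:6]             # 3 4 5 6 NA 3
rep(v.n, each = 4)[1:6]      # 3 3 3 3 4 4

# length.out fixes the number of points; by fixes the step size
length(seq(from = 3, to = 10, length.out = 10))  # 10
length(seq(from = 3, to = 10, by = 0.5))         # 15
```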

## 4.3 Matrix

The R function matrix creates a matrix:

m1 <- matrix(rnorm(12), 3, 4)

The functions cbind and rbind also create a matrix, binding vectors together as columns or as rows:

v1 <- runif(10)
v2 <- rnorm(10)
m1 <- cbind(v1, v2)
m2 <- rbind(v1, v2)
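
The difference between the two binding functions shows up in the dimensions of the result. The sketch below uses an arbitrary seed only so that the run is repeatable; m.col and m.row are illustrative names.

```r
set.seed(1)                 # arbitrary seed, only for repeatability
v1 <- runif(10)
v2 <- rnorm(10)

m.col <- cbind(v1, v2)      # vectors become columns
m.row <- rbind(v1, v2)      # vectors become rows
dim(m.col)                  # 10  2
dim(m.row)                  #  2 10
colnames(m.col)             # "v1" "v2"

# matrix() fills column by column unless byrow = TRUE
m.bycol <- matrix(1:6, nrow = 2)
m.byrow <- matrix(1:6, nrow = 2, byrow = TRUE)
m.bycol[1, ]                # 1 3 5
m.byrow[1, ]                # 1 2 3
```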

## 4.4 Dataframe

The data frame is probably the most commonly used data object. It looks like a matrix but has mode list. Each column is a variable and each row is an observation. A column can be numeric, character, or logical, and each column has a unique name.

m1 <- matrix(rnorm(6),2,3,byrow=TRUE)
m2 <- rbind(m1,c(1,1,2))
m2 <- cbind(m2, c(1,1,2))

d1 <- data.frame(m2)
d2 <- data.frame(v1=rnorm(3), v2=runif(3))
names(d2)
## [1] "v1" "v2"
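
Unlike a matrix, a data frame can hold columns of different types side by side. The following sketch uses made-up data (d.mix and its column names are hypothetical); stringsAsFactors = FALSE is set explicitly because older R versions would otherwise convert the character column to a factor.

```r
# Hypothetical data: three observations of three variables
d.mix <- data.frame(
  name  = c("Tom", "Jim", "Tim"),
  score = c(3.5, 4, 3),
  pass  = c(TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)
sapply(d.mix, class)   # name "character", score "numeric", pass "logical"
dim(d.mix)             # 3 3
d.mix$score            # 3.5 4.0 3.0
```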

A data frame is a list. A list is a collection of elements (just like a vector), but the elements of a list can be of different modes:

l1 <- list(c(1,2,3), matrix(rnorm(9), 3, 3),
           c("Tim","Tom","Jim"))
l1
## [[1]]
## [1] 1 2 3
##
## [[2]]
##            [,1]       [,2]       [,3]
## [1,]  1.2735046  1.2839747 -1.9313985
## [2,] -0.4933836 -1.1187934  0.6498940
## [3,]  1.2601138  0.0290398  0.1638518
##
## [[3]]
## [1] "Tim" "Tom" "Jim"

More precisely, a data frame is a list of vectors of the same length. Data imported from a spreadsheet-style file arrive as a data frame by default.
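
The list nature of a data frame can be checked directly. This sketch uses fixed values instead of random ones so that the results are predictable.

```r
d2 <- data.frame(v1 = c(1, 2, 3), v2 = c(10, 20, 30))

is.list(d2)     # TRUE
length(d2)      # 2: the number of columns, i.e. list elements
d2[["v1"]]      # 1 2 3: list-style access to one column
lengths(d2)     # v1 3, v2 3: every column has the same length
```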

## 4.5 Importing Data

The basic R function for reading text data is scan. The most useful functions are read.table and read.csv, which import a text file into a data frame.
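
As a rough illustration of these functions, the sketch below writes a tiny CSV to a temporary file and reads it back. The file contents and the names tmp, d.csv, d.tab and s are made up for the example.

```r
# Write a small CSV to a temporary file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("name,speed", "oliver,3.5", "olivia,4"), tmp)

d.csv <- read.csv(tmp)                              # header = TRUE by default
d.tab <- read.table(tmp, header = TRUE, sep = ",")  # the same settings spelled out
class(d.csv)      # "data.frame"
names(d.csv)      # "name" "speed"

# scan returns a plain vector rather than a data frame
s <- scan(text = "3 4 5 6")
s                 # 3 4 5 6

unlink(tmp)       # remove the temporary file
```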

### 4.5.1 Extract elements

s<-v[2] #extracts the second element of v and stores it in s
v2<-v[2:3] #extracts the 2nd and 3rd elements
v3<-v[c(1,3,4)] #extracts the 1st, 3rd and 4th elements
v4<-m[,2] #puts the 2nd column of a matrix or data frame into vector v4
v5<-v[v>0] #v5 has all positive values of v
v6<-v[!is.na(v)] #all non-missing values
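
The extraction patterns above can be tried on concrete data. In this sketch the vector v and the matrix m are made-up values; note in particular how a missing value behaves in a logical condition.

```r
v <- c(3, -4, 5, NA, 6)
m <- matrix(1:6, nrow = 3)   # columns are 1:3 and 4:6

v[2]              # -4
v[2:3]            # -4  5
v[c(1, 3, 4)]     #  3  5 NA
m[, 2]            #  4  5  6
v[!is.na(v)]      #  3 -4  5  6
v[v > 0]          #  3  5 NA  6  (an NA in the condition gives NA in the result)
v[which(v > 0)]   #  3  5  6     (which drops the NA)
```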

### 4.5.2 Read data from internet

Fixed-width format files are read with read.fwf:

read.fwf(file, widths, header = FALSE, sep = "\t", skip = 0, row.names, col.names, n = -1, buffersize = 2000, ...)
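
A small sketch of read.fwf on a local temporary file follows. The record layout (a 3-character code, a 2-digit age, a 4-digit year) and the column names are invented for the example. As far as I know, a URL string can be passed as file in the same way as a local path, but that is not exercised here.

```r
# Each line holds a 3-character code, a 2-digit age and a 4-digit year
tmp <- tempfile(fileext = ".txt")
writeLines(c("abc251998", "xyz312004"), tmp)

d.fwf <- read.fwf(tmp, widths = c(3, 2, 4),
                  col.names = c("code", "age", "year"))
d.fwf$age     # 25 31
d.fwf$year    # 1998 2004

unlink(tmp)   # remove the temporary file
```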

## 4.6 Types of data

### 4.6.1 Variables

1. Integer (e.g. 100L)
2. Numeric (e.g. 0.05)
3. Character (e.g. "hello")
4. Logical (e.g. TRUE)
5. Factor (e.g. factor("Green"))
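
The class function reports which of these types a value has. The short sketch below also shows that a plain number such as 100 is stored as numeric unless the L suffix is used; the factor f is a made-up example.

```r
class(100L)       # "integer"   (the L suffix marks an integer)
class(100)        # "numeric"   (plain numbers are doubles)
class(0.05)       # "numeric"
class("hello")    # "character"
class(TRUE)       # "logical"

f <- factor(c("Green", "Red", "Green"))
class(f)          # "factor"
levels(f)         # "Green" "Red"
as.integer(f)     # 1 2 1: the integer codes behind the labels
```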

### 4.6.2 Types of data objects

1. Vector
2. Matrix
3. List and dataframe
4. Array

#### 4.6.2.1 Numeric vectors

x <- c(2, 6, 1, 5, 2.5)
y <- c(0, 6, 3, 2.6, 9.4)

x[3] #Accessing the third element of a vector
## [1] 1

#### 4.6.2.2 Vector operations

z <- x + y #sum by elements
z[2] #the second element
## [1] 12
z[-2] #all but the second element
## [1]  2.0  4.0  7.6 11.9
z[c(2,4)] #the second and the fourth elements
## [1] 12.0  7.6
z[c(2:4)] #elements 2 to 4
## [1] 12.0  4.0  7.6
z[-c(2:4)] #all except elements 2 to 4
## [1]  2.0 11.9

#### 4.6.2.3 Vectors of logical values

TRUE and FALSE are logical values.

z>10 #compares each element of z to 10 and returns a vector of logical values
## [1] FALSE  TRUE FALSE FALSE  TRUE
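
A logical vector can be counted, located, and used as an index. The sketch below writes out the values of z explicitly so that it runs on its own; big is an illustrative name.

```r
z <- c(2, 12, 4, 7.6, 11.9)
big <- z > 10

sum(big)      # 2: TRUE counts as 1, FALSE as 0
which(big)    # 2 5: positions of the TRUE values
any(big)      # TRUE
all(big)      # FALSE
z[big]        # 12.0 11.9
```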

#### 4.6.2.4 Logical comparisons

1. <
2. ==
3. <=
4. >=
5. xor

x>y #element-wise comparison
## [1]  TRUE FALSE FALSE  TRUE FALSE
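
A few more element-wise comparisons on the same x and y, written out so the block runs on its own. The != operator is not in the list above and is included only for comparison.

```r
x <- c(2, 6, 1, 5, 2.5)
y <- c(0, 6, 3, 2.6, 9.4)

x == y               # FALSE  TRUE FALSE FALSE FALSE
x >= y               #  TRUE  TRUE FALSE  TRUE FALSE
x != y               #  TRUE FALSE  TRUE  TRUE  TRUE

# xor is TRUE when exactly one of its two arguments is TRUE
xor(x > 2, y > 2)    # FALSE FALSE  TRUE FALSE FALSE
```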

## 4.7 Subsetting vectors

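names <- c("oliver", "olivia", "henry", "mary")
sex <- c("M", "F", "M", "F")
speed <- c(3.5, 4, 3, 3.25)
names[sex=="F"] #names of the females
speed[sex=="M"] #speeds of the males
names[speed<=3.5] #names with a speed of at most 3.5
z[z>10] #elements of z greater than 10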

## 4.8 Sorting data

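order(names) #indices that sort names alphabetically
order(speed) #indices that sort speed in increasing order

names[order(speed)] #names ordered by speed
names[order(names)] #names in alphabetical order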
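
Since order returns positions rather than sorted values, it can reorder one vector by the values of another. The sketch below repeats the definitions of names and speed so that it is self-contained, and compares order with sort.

```r
names <- c("oliver", "olivia", "henry", "mary")
speed <- c(3.5, 4, 3, 3.25)

order(speed)                             # 3 4 1 2
names[order(speed)]                      # "henry" "mary" "oliver" "olivia"
names[order(speed, decreasing = TRUE)]   # "olivia" "oliver" "mary" "henry"
sort(speed)                              # 3.00 3.25 3.50 4.00
```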

## 4.9 Merging vectors

z <- c(x, y) #Concatenates x and y into a single vector of length 10
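
For contrast, this sketch compares c, which joins two vectors end to end, with cbind, which places them side by side as the columns of a matrix. The name xy is illustrative.

```r
x <- c(2, 6, 1, 5, 2.5)
y <- c(0, 6, 3, 2.6, 9.4)

z <- c(x, y)
length(z)        # 10: one long vector

xy <- cbind(x, y)
dim(xy)          # 5 2: a matrix with x and y as columns
```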

## 4.10 Generating vectors

c(3,5,6) #3 5 6
2:5 #2 3 4 5
seq(2, 3, by=0.5) #2.0 2.5 3.0
rep(1:2, each=3) #1 1 1 2 2 2
rep(1:2, times=3) #1 2 1 2 1 2