4 Fundamentals
This chapter covers the basics of R and RStudio.
4.1 Syntax
Plain text
End a line with two spaces to start a new paragraph.
*italics* and _italics_
**bold** and __bold__
superscript^2^
~~strikethrough~~
link
# Header 1 (#)
## Header 2 (##)
### Header 3 (###)
#### Header 4 (####)
##### Header 5 (#####)
###### Header 6 (######)
endash: --
emdash: ---
ellipsis: ...
inline equation: $A = \pi * r^{2}$
image:
horizontal rule (or slide break):
***
> block quote
* unordered list
* item 2
    + sub-item 1
    + sub-item 2
1. ordered list
2. item 2
    + sub-item 1
    + sub-item 2
Table Header | Second Header
------------- | -------------
Table Cell | Cell 2
Cell 3 | Cell 4
4.2 Vectors
A vector is a collection of elements of the same mode (numeric, character, or logical):
v.n <- c(3,4,5,6, NA)
v.c <- c("Tom","Jim","Tim")
v.l <- c(TRUE,TRUE,FALSE)
# A missing value is coded as NA (see the last element of v.n)
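As a small illustration of the statement about modes, the sketch below checks the mode of each vector defined above and shows how the missing value behaves. It relies only on the base R functions mode, is.na, and mean; the comments are illustrative additions.

```r
# Vectors from the text above
v.n <- c(3, 4, 5, 6, NA)
v.c <- c("Tom", "Jim", "Tim")
v.l <- c(TRUE, TRUE, FALSE)

mode(v.n)   # "numeric"
mode(v.c)   # "character"
mode(v.l)   # "logical"

is.na(v.n)                # TRUE only for the last element
mean(v.n)                 # NA, because one value is missing
mean(v.n, na.rm = TRUE)   # 4.5, the mean of 3, 4, 5, 6
```

Most summary functions return NA when a value is missing unless told to drop missing values with na.rm = TRUE.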
We can create a vector with the c function (concatenation) or with the functions seq and rep:
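v.n1 <- rep(2, 4)                          # the value 2 repeated 4 times
v.n2 <- rep(v.n, 4)                        # the whole vector v.n repeated 4 times
v.n3 <- rep(v.n, each=4)                   # each element of v.n repeated 4 times
v.n4 <- seq(from=3, to=10, length.out=10)  # 10 equally spaced values from 3 to 10
v.n5 <- seq(from=3, to=10, by=0.5)         # from 3 to 10 in steps of 0.5
v.n6 <- 1:10                               # the integers 1 to 10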
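To make the difference between the rep and seq variants concrete, here is a minimal sketch that prints or measures a few of the results. It reuses v.n from the text and uses a repeat count of 2 instead of 4 only to keep the output short.

```r
v.n <- c(3, 4, 5, 6, NA)

rep(v.n, 2)          # 3 4 5 6 NA 3 4 5 6 NA
rep(v.n, each = 2)   # 3 3 4 4 5 5 6 6 NA NA

# length.out fixes the number of values, by fixes the step size
length(seq(from = 3, to = 10, length.out = 10))   # 10
length(seq(from = 3, to = 10, by = 0.5))          # 15
```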
4.3 Matrix
The R function matrix creates a matrix. Here 12 random normal values fill a matrix with 3 rows and 4 columns:
m1 <- matrix(rnorm(12), 3, 4)
Other functions for creating a matrix are cbind and rbind, which bind vectors together as columns or as rows:
v1 <- runif(10)
v2 <- rnorm(10)
m1 <- cbind(v1, v2)
m2 <- rbind(v1, v2)
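The shapes produced by these functions are easy to check with dim. The sketch below uses a small integer matrix (an illustrative choice, so the values are predictable) alongside the random vectors from the text.

```r
m <- matrix(1:12, nrow = 3, ncol = 4)   # filled column by column
dim(m)      # 3 4
m[2, 3]     # 8: row 2, column 3

v1 <- runif(10)
v2 <- rnorm(10)
dim(cbind(v1, v2))   # 10 2: each vector becomes a column
dim(rbind(v1, v2))   # 2 10: each vector becomes a row
```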
4.4 Dataframe
The data frame is probably the most commonly used data object in R. It looks like a matrix but is stored as a list (its mode is list). Each column is a variable and each row is an observation. A column can be numeric, character, or logical, and each column has a unique name.
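m1 <- matrix(rnorm(6), 2, 3, byrow=TRUE)  # 2 x 3 matrix, filled row by row
m2 <- rbind(m1, c(1,1,2))                 # append a row: 3 x 3
m2 <- cbind(m2, c(1,1,2))                 # append a column: 3 x 4
d1 <- data.frame(m2)                      # convert the matrix to a data frame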
d2 <- data.frame(v1=rnorm(3), v2=runif(3))
names(d2)
## [1] "v1" "v2"
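The following sketch shows a few common ways to look inside a data frame. The data frame d3 and its column names (name, age, member) are hypothetical and serve only to show columns of different types side by side.

```r
d2 <- data.frame(v1 = rnorm(3), v2 = runif(3))

dim(d2)       # 3 2: three observations, two variables
d2$v1         # the column named v1, as a numeric vector
d2[2, ]       # the second observation (row)
d2[, "v2"]    # the column named v2

# Hypothetical example: columns of different types in one data frame
d3 <- data.frame(name   = c("Tom", "Jim", "Tim"),
                 age    = c(31, 25, 40),
                 member = c(TRUE, FALSE, TRUE))
sapply(d3, class)   # class of each column
```

In R 4.0 and later the name column is stored as character; earlier versions convert it to a factor by default.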
A data frame is a list. A list is a collection of elements (just like a vector), but the elements of a list can be of different modes:
l1 <- list(c(1,2,3), matrix(rnorm(9), 3, 3),
c("Tim","Tom","Jim"))
l1
## [[1]]
## [1] 1 2 3
##
## [[2]]
##            [,1]       [,2]       [,3]
## [1,]  1.2735046  1.2839747 -1.9313985
## [2,] -0.4933836 -1.1187934  0.6498940
## [3,]  1.2601138  0.0290398  0.1638518
##
## [[3]]
## [1] "Tim" "Tom" "Jim"
A data frame is a list of vectors of the same length. When data are imported from a spreadsheet file, the result is a data frame by default.
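A short sketch of how list elements are accessed, and of the sense in which a data frame behaves like a list. The small data frame d is an illustrative assumption, not taken from the text.

```r
l1 <- list(c(1, 2, 3), matrix(rnorm(9), 3, 3), c("Tim", "Tom", "Jim"))

l1[[1]]        # the first element itself: 1 2 3
l1[[3]][2]     # "Tom": second entry of the third element
class(l1[1])   # "list": single brackets return a sub-list

d <- data.frame(v1 = c(1, 2, 3), v2 = c("a", "b", "c"))
is.list(d)     # TRUE
length(d)      # 2: the number of columns
d[["v1"]]      # 1 2 3, the same as d$v1
```

Double brackets extract the element; single brackets keep the list wrapper. The same rule applies to the columns of a data frame.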
4.5 Importing Data
The basic R function for reading text data is scan, but the most useful functions in practice are read.table and read.csv. Both import a text file into a data frame.
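The sketch below shows one way to try these functions without needing an existing file: it writes a tiny CSV to a temporary path and reads it back. The file contents and column names (name, score) are made up for illustration.

```r
# Write a small CSV file to a temporary location (illustrative data)
path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "Tom,3.5", "Jim,4.0", "Tim,NA"), path)

d <- read.csv(path)   # header = TRUE and sep = "," are the defaults
class(d)              # "data.frame"
dim(d)                # 3 2
is.na(d$score)        # FALSE FALSE TRUE: the text NA is read as missing

# read.table needs the header and separator spelled out
d.t <- read.table(path, header = TRUE, sep = ",")
dim(d.t)              # 3 2
```

For a real file, replace path with the file's location on disk.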
4.6 Types of data
4.6.1 Variables
- Integer (e.g. 100L)
- Numeric (e.g. 0.05)
- Character (e.g. "hello")
- Logical (e.g. TRUE)
- Factor (e.g. "Green")
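The type of a value can be checked with the base R function class. A minimal sketch covering each type in the list above (the factor values are illustrative):

```r
class(100L)      # "integer": the L suffix marks an integer
class(100)       # "numeric": a number without L is stored as a double
class(0.05)      # "numeric"
class("hello")   # "character"
class(TRUE)      # "logical"

f <- factor(c("Green", "Red", "Green"))
class(f)         # "factor"
levels(f)        # "Green" "Red"
```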
4.6.2 Types of data objects
- Vector
- Matrix
- List and data frame
- Array
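Arrays are the only object in this list not shown elsewhere in the chapter, so here is a brief sketch. An array generalizes a matrix to more than two dimensions; the dimensions chosen below are arbitrary.

```r
a <- array(1:24, dim = c(2, 3, 4))   # 2 rows, 3 columns, 4 layers
dim(a)        # 2 3 4
a[1, 2, 3]    # 15: row 1, column 2, layer 3
a[, , 1]      # the first layer, a 2 x 3 matrix
```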
4.6.2.1 Numeric vectors
x <- c(2, 6, 1, 5, 2.5)
y <- c(0, 6, 3, 2.6, 9.4)
x[3] #Accessing elements of a vector
## [1] 1
4.6.2.2 Vector operations
z <- x + y #sum by elements
z[2] #the second element
## [1] 12
z[-2] #all but the second element
## [1] 2.0 4.0 7.6 11.9
z[c(2,4)] #the second and the fourth elements
## [1] 12.0 7.6
z[c(2:4)] #elements 2 to 4
## [1] 12.0 4.0 7.6
z[-c(2:4)] #all except elements 2 to 4
## [1] 2.0 11.9
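Addition is not the only element-wise operation. As a hedged sketch using the same x and y, other arithmetic operators and a few summary functions work the same way:

```r
x <- c(2, 6, 1, 5, 2.5)
y <- c(0, 6, 3, 2.6, 9.4)

x * y       # 0.0 36.0 3.0 13.0 23.5: product by elements
x * 2       # 4 12 2 10 5: the scalar is applied to every element
sum(x)      # 16.5
length(x)   # 5
```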