1.7 Object types

Objects in R can be of various types, or classes. You can use the class function to view an object’s class, and one of the many ‘is.’ functions to test whether an object is of a particular class.

1.7.1 Numeric single value

A numeric object contains only numbers (e.g., 1, 2, 4.13, pi, etc.).

x <- 5
x
## [1] 5
class(x)
## [1] "numeric"
is.numeric(x)
## [1] TRUE

1.7.2 Character single value

A character string is a set of characters within one pair of quotes and may include spaces. For example, “elevated” and “elevated blood pressure” are each character objects with a single character value (a single string).

y <- "elevated blood pressure"
y
## [1] "elevated blood pressure"
class(y)
## [1] "character"
is.character(y)
## [1] TRUE
is.numeric(y)
## [1] FALSE
is.character(x)  # The numeric object x we created before is still around 
## [1] FALSE

1.7.3 Vector

A vector contains an ordered collection of numbers or character strings indexed by the integers 1, 2, …, n, where n is the length of the vector.

z <- c(5, 8, 12)
z
## [1]  5  8 12
class(z)
## [1] "numeric"
is.vector(z)
## [1] TRUE
z[2] # The second element of the vector
## [1] 8

Vectors can contain either numeric or character values, but not both.

c(5, 8, 12)
## [1]  5  8 12
c("yes", "no")
## [1] "yes" "no"

If you attempt to create a vector containing both numeric and character values, R will convert the numeric values to character.

c(5, "no")
## [1] "5"  "no"
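
To confirm that the conversion really happened, you can check the class of the result. The sketch below uses a hypothetical name, mixed, and also shows that converting back to numeric cannot recover a value from a non-numeric string.

```r
# mixed is a hypothetical name used only for this illustration
mixed <- c(5, "no")
class(mixed)                          # "character": 5 is now the string "5"
# Converting back: "5" becomes 5, but "no" has no numeric value and becomes NA.
# R issues a warning here, which is suppressed to keep the output clean.
suppressWarnings(as.numeric(mixed))   # 5 NA
```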

Other ways of creating vectors are the following.

1:7                # Create a sequence of numbers from 1 to 7
## [1] 1 2 3 4 5 6 7
seq(2, 10, by = 2) # Create a sequence from 2 to 10 in steps of 2
## [1]  2  4  6  8 10
rep(3, 5)          # Repeat the number 3, 5 times
## [1] 3 3 3 3 3
rep(c(1,2), 3)     # Repeat the vector (1,2) 3 times
## [1] 1 2 1 2 1 2

1.7.4 Factor

A factor is a special kind of vector that stores underlying integer codes 1, 2, …, n, where each of these n codes has an associated character label (which may or may not be the numeric value). These labeled values are the levels of the factor. One common use of a factor is to store a categorical variable for use in a data analysis. Once you have created a factor with specific levels, no element of that factor can take on a value that is not one of its pre-assigned levels; attempting to assign such a value produces a missing value (NA) and a warning.

You can create a factor from a character vector, and R will assume that the unique values are the labels for the levels.

y <- factor(c("underweight", "underweight", "normal", "overweight", "normal"))
levels(y) # The levels function shows the possible values of a factor
## [1] "normal"      "overweight"  "underweight"
y         # Displaying the factor will also show the levels
## [1] underweight underweight normal      overweight  normal     
## Levels: normal overweight underweight
class(y)
## [1] "factor"
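
The following sketch, using a hypothetical factor named bmi so as not to overwrite y, illustrates the two properties described at the start of this subsection: the underlying integer codes, and what happens when a value outside the levels is assigned.

```r
# bmi is a hypothetical name used only for this illustration
bmi <- factor(c("underweight", "underweight", "normal", "overweight", "normal"))
# Levels are sorted alphabetically: normal = 1, overweight = 2, underweight = 3
codes <- as.integer(bmi)
codes                               # 3 3 1 2 1
# "obese" is not an existing level, so R warns and stores NA instead
suppressWarnings(bmi[1] <- "obese")
is.na(bmi[1])                       # TRUE
levels(bmi)                         # The levels themselves are unchanged
```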

If you want to change the labels, you can do so by assigning a new value to its levels. For example, suppose we want the labels to be capitalized.

levels(y) <- c("Normal", "Overweight", "Underweight")
levels(y)
## [1] "Normal"      "Overweight"  "Underweight"

Alternatively, you can assign new labels to the levels when you create the factor. This has the added advantage of letting you decide the order in which the levels appear. When we created the factor above, R automatically assigned the levels by taking the unique values of the data and sorting them alphabetically. For various reasons (such as making a bar chart later), you might want the levels in a different order. You can specify the order of the levels when you create the factor, but be careful: if you leave out a value that appears in the data, that value will be set to missing (NA).

In the previous example, we would like the order to be from lower to higher weight.

# Enter ORIGINAL values in levels
# Enter the NEW level labels in labels
# Make sure the orderings of levels and labels correspond
y <- factor(c("underweight", "underweight", "normal", "overweight", "normal"),
            levels = c("underweight", "normal", "overweight"),
            labels = c("Underweight", "Normal", "Overweight"))
levels(y)
## [1] "Underweight" "Normal"      "Overweight"
y
## [1] Underweight Underweight Normal      Overweight  Normal     
## Levels: Underweight Normal Overweight
class(y)
## [1] "factor"
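
The caution about leaving out a level can be seen in a short sketch. Here a hypothetical factor w is created with "overweight" deliberately omitted from levels, so the corresponding element is set to NA without any warning.

```r
# w is a hypothetical name used only for this illustration
w <- factor(c("underweight", "normal", "overweight"),
            levels = c("underweight", "normal"))  # "overweight" left out on purpose
is.na(w)      # FALSE FALSE TRUE: the omitted value became missing
levels(w)     # "underweight" "normal"
```

Because this happens silently, it may be worth checking sum(is.na(...)) after creating a factor with explicit levels.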

1.7.5 Matrix

A matrix contains a two-dimensional collection of numeric or character values indexed by pairs of integers (i, j).

x <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
class(x)
## [1] "matrix" "array"
z <- matrix(c("a","b","c","d","e","f"), nrow = 2, ncol = 3)
z
##      [,1] [,2] [,3]
## [1,] "a"  "c"  "e" 
## [2,] "b"  "d"  "f"
class(z)
## [1] "matrix" "array"

Notice that the matrix object has 2 classes. It is a matrix, but in some cases will be treated as an array object (see next subsection).
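
As a brief sketch of the (i, j) indexing mentioned at the start of this subsection, the example below uses a hypothetical matrix m. It also checks that m passes both the matrix and the array tests.

```r
# m is a hypothetical name used only for this illustration
m <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
dim(m)         # Number of rows and columns: 2 3
m[2, 3]        # Element in row 2, column 3: 6
m[1, ]         # All of row 1: 1 3 5
m[, 2]         # All of column 2: 3 4
is.matrix(m)   # TRUE
is.array(m)    # TRUE
```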

You can combine matrices by columns (cbind) or by rows (rbind), as long as they have the same number of rows or columns, respectively.

x <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
z <- matrix(c(6,5),         nrow = 2, ncol = 1)
x
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
z
##      [,1]
## [1,]    6
## [2,]    5
cbind(x, z)
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    6
## [2,]    2    4    6    5
x <- matrix(c(1,2,3,4,5,6), nrow = 2, ncol = 3)
z <- matrix(c(6,5,4),       nrow = 1, ncol = 3)
x
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
z
##      [,1] [,2] [,3]
## [1,]    6    5    4
rbind(x, z)
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6
## [3,]    6    5    4
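
If the dimensions do not match, the combination fails. The sketch below uses hypothetical matrices a and b and wraps the call in tryCatch so that the error is captured as text instead of stopping the script; the exact wording of the message may vary across R versions.

```r
# a and b are hypothetical names used only for this illustration
a <- matrix(1:6, nrow = 2, ncol = 3)   # 2 rows, 3 columns
b <- matrix(1:3, nrow = 3, ncol = 1)   # 3 rows, 1 column
# cbind requires equal numbers of rows, so this raises an error
result <- tryCatch(cbind(a, b),
                   error = function(e) conditionMessage(e))
result                                  # The captured error message
# Transposing b gives a 1 x 3 matrix, which can be added as a new row
dim(rbind(a, t(b)))                     # 3 3
```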

1.7.6 Array

An array contains an n-dimensional collection of numeric or character values indexed by n-tuples of integers (e.g., (i,j,k) for a 3-dimensional array).

z <- array(c(1,2,3,4,5,6,7,8), dim = c(2,2,2))
z
## , , 1
## 
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
## 
## , , 2
## 
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
class(z)
## [1] "array"
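
Indexing an array works like indexing a matrix, with one index per dimension. A minimal sketch, using a hypothetical array named arr:

```r
# arr is a hypothetical name used only for this illustration
arr <- array(c(1,2,3,4,5,6,7,8), dim = c(2,2,2))
dim(arr)         # 2 2 2
arr[1, 2, 2]     # Row 1, column 2 of the second 2 x 2 slice: 7
arr[, , 1]       # The entire first slice, returned as a 2 x 2 matrix
```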

1.7.7 List

A list contains an ordered collection of objects, and the objects in the list can be of different types.

x <- list("5", c(1,2,3), y) # I put y, an object we created earlier, in this list
x
## [[1]]
## [1] "5"
## 
## [[2]]
## [1] 1 2 3
## 
## [[3]]
## [1] Underweight Underweight Normal      Overweight  Normal     
## Levels: Underweight Normal Overweight
class(x)
## [1] "list"
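
To extract elements from a list, double brackets return the element itself, while single brackets return a shorter list. The sketch below uses hypothetical lists named items and named to illustrate the difference.

```r
# items and named are hypothetical names used only for this illustration
items <- list("5", c(1, 2, 3), c("yes", "no"))
length(items)        # 3
items[[2]]           # The second element itself: 1 2 3
class(items[[2]])    # "numeric"
class(items[2])      # "list": single brackets keep the list wrapper
# List elements can also be given names and extracted with $
named <- list(id = "5", values = c(1, 2, 3))
named$values         # 1 2 3
```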

1.7.8 Data frame

A data.frame is a type of list commonly used to store datasets, with a row for each observation and a column for each variable, and each variable can be of a different type (e.g., numeric, character, etc.). When you create a data.frame, you can optionally name the columns (the elements of the list).

x <- data.frame(outcome = c(1,0,1,1),
                exposure = c("yes", "yes", "no", "no"),
                age = c(24, 55, 39, 18))
x
##   outcome exposure age
## 1       1      yes  24
## 2       0      yes  55
## 3       1       no  39
## 4       1       no  18
class(x)
## [1] "data.frame"
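
Because a data.frame is a list of columns, the usual list tools apply to it. The sketch below uses a hypothetical data.frame named df with the same contents as above; stringsAsFactors = FALSE is set explicitly as an assumption, so that the character column stays character regardless of R version.

```r
# df is a hypothetical name used only for this illustration
df <- data.frame(outcome = c(1,0,1,1),
                 exposure = c("yes", "yes", "no", "no"),
                 age = c(24, 55, 39, 18),
                 stringsAsFactors = FALSE)
is.list(df)          # TRUE
nrow(df)             # 4 observations
ncol(df)             # 3 variables
df$age               # Extract one column as a vector: 24 55 39 18
df[2, "age"]         # Row 2 of the age column: 55
sapply(df, class)    # Class of each column
```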

1.7.9 Converting between types

You can convert between object types using the ‘as.’ functions. For example:

# Convert character to factor
x <- c("a", "g", "b")
as.factor(x)
## [1] a g b
## Levels: a b g
# Convert logical to numeric
# (TRUE changes to 1, FALSE changes to 0)
x <- c(FALSE, TRUE, FALSE)
as.numeric(x)
## [1] 0 1 0
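
A few further conversions are sketched below, using a hypothetical factor f. One point to watch: applying as.numeric directly to a factor returns the underlying integer codes, not the labels, so a factor whose labels are numbers should be converted to character first.

```r
as.character(5)               # "5"
as.numeric("4.13")            # 4.13
# f is a hypothetical name used only for this illustration
f <- factor(c("10", "20", "10"))
as.numeric(f)                 # 1 2 1: the integer codes, not the labels
as.numeric(as.character(f))   # 10 20 10: the values the labels represent
```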