In this document, we’ll practice working with objects in R. As I said before, we can use <- or = for assigning things to objects.
This is a simple assignment. We are assigning a number to an object. (R stores 21 as a double by default; write 21L if you need a true integer.)
min_age = 21
min_age
## [1] 21
We are assigning a string (a combination of characters) to an object.
greeting = "Hi! My name is (what?), My name is (who?), My name is ..."
greeting
## [1] "Hi! My name is (what?), My name is (who?), My name is ..."
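To see what kind of value an object holds, we can ask R directly. This is a small sketch using the base functions class() and typeof(); the shortened greeting string is only a stand-in for the one above.

```r
min_age = 21
greeting = "Hi! My name is ..."   # shortened stand-in for the string above

class(min_age)     # "numeric"
typeof(min_age)    # "double": plain numbers are stored as doubles
typeof(21L)        # "integer": the L suffix asks for an integer
class(greeting)    # "character"
```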
The two examples above are objects with one value each. We can also have vectors that store multiple values.
my_num_vec = c(2, 4, 6)
my_char_vec = c("Hubei", "Sichuan")
Because vectors have multiple elements, we can call each element individually:
my_num_vec[1]
## [1] 2
my_char_vec[2]
## [1] "Sichuan"
Note that in R, unlike in Python, the index of the first element of a vector is 1 rather than 0. Asking for index 0 does not raise an error; it returns an empty vector:
my_num_vec[0]
## numeric(0)
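A few more indexing cases are worth trying alongside this one. The sketch below reuses my_num_vec and shows selecting several elements at once, dropping an element with a negative index, and asking for a position past the end of the vector.

```r
my_num_vec = c(2, 4, 6)

my_num_vec[0]         # numeric(0): an empty vector, not an error
my_num_vec[c(1, 3)]   # 2 6: first and third elements
my_num_vec[-1]        # 4 6: everything except the first element
my_num_vec[4]         # NA: there is no fourth element
```

In Python, a negative index counts from the end; in R it removes that position instead.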
We can create a sequential numerical vector by saying
my_seq_1 = c(1:10)
my_seq_2 = c(1:10, 20:25)
How many elements are in the my_seq_2 vector? Be sure to use the length() function.
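One way to check the answer is to call length() on both sequences. The 1:10 part contributes 10 elements and the 20:25 part contributes 6.

```r
my_seq_1 = c(1:10)
my_seq_2 = c(1:10, 20:25)

length(my_seq_1)   # 10
length(my_seq_2)   # 16

# c() around a single sequence is optional; 1:10 is already a vector
identical(my_seq_1, 1:10)   # TRUE
```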
Factor objects store categorical variables (e.g. color, shape): variables that can take only a limited set of possible values.
For example, a categorical variable could hold the colors of the rainbow. Each value is one of red, orange, yellow, green, blue, indigo, or violet, so the variable is limited to these seven values.
colors = c("red", "red", "blue", "red", "blue")
To create a factor object out of this character vector, we can use the factor() function or the as.factor() function. Let’s try both and look at the objects created.
With the factor() function, we can specify the order of the levels ourselves:
colors_factor1 = factor(colors, levels = c("red", "blue"))
colors_factor1
## [1] red red blue red blue
## Levels: red blue
However, with the as.factor() function, the levels are automatically put in alphabetical order:
colors_factor2 = as.factor(colors)
colors_factor2
## [1] red red blue red blue
## Levels: blue red
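The level order matters because a factor stores each value as an integer code that points at a level. The sketch below rebuilds both factors and compares their levels and codes, using the base functions levels(), as.integer(), and table().

```r
colors = c("red", "red", "blue", "red", "blue")
colors_factor1 = factor(colors, levels = c("red", "blue"))
colors_factor2 = as.factor(colors)

levels(colors_factor1)       # "red" "blue"
levels(colors_factor2)       # "blue" "red"

# Same values, different integer codes, because the level order differs
as.integer(colors_factor1)   # 1 1 2 1 2
as.integer(colors_factor2)   # 2 2 1 2 1

# Counts are reported in level order
table(colors_factor1)
```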
Now, let’s work with data frames:
my_df = data.frame("first_column" = c(1, 2, 3),
                   "second_column" = c(4, 5, NA))
my_df
##   first_column second_column
## 1            1             4
## 2            2             5
## 3            3            NA
We saw that we can call an element of a vector by simply saying my_vec[1] or my_vec[16]. Let’s see how we can call elements of a data frame.
Which code returns the values in the second column of the my_df data frame we created above?
my_df[2, 2]
## [1] 5
my_df[2, ]
##   first_column second_column
## 2            2             5
my_df[, 2]
## [1] 4 5 NA
my_df[2]
##   second_column
## 1             4
## 2             5
## 3            NA
my_df["second_column"]
##   second_column
## 1             4
## 2             5
## 3            NA
my_df[, "second_column"]
## [1] 4 5 NA
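Several of these calls return the second column, but not in the same shape. As a rough guide for base R data frames, single brackets with one index give back a one-column data frame, while the comma form gives back a plain vector. The sketch below checks this, and also shows the $ and [[ ]] forms, which return a vector too.

```r
my_df = data.frame("first_column" = c(1, 2, 3),
                   "second_column" = c(4, 5, NA))

# These return a one-column data frame
is.data.frame(my_df[2])                 # TRUE
is.data.frame(my_df["second_column"])   # TRUE

# These return a plain numeric vector
is.data.frame(my_df[, 2])               # FALSE
is.numeric(my_df[, "second_column"])    # TRUE

# $ and [[ ]] also pull the column out as a vector
identical(my_df$second_column, my_df[, 2])          # TRUE
identical(my_df[["second_column"]], my_df[, 2])     # TRUE
```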
Which code returns the values in the second row of the my_df data frame we created above?
my_df[2, ]
##   first_column second_column
## 2            2             5
How do we find the dimensions of a data frame, the way we used the length() function for vectors?
dim(my_df)
## [1] 3 2
nrow(my_df)
## [1] 3
ncol(my_df)
## [1] 2
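These three functions agree with each other: dim() returns the number of rows first and the number of columns second. One thing to watch for, shown in the sketch below, is that length() on a data frame counts columns, not rows.

```r
my_df = data.frame("first_column" = c(1, 2, 3),
                   "second_column" = c(4, 5, NA))

dims = dim(my_df)
dims[1] == nrow(my_df)   # TRUE: first entry is the row count
dims[2] == ncol(my_df)   # TRUE: second entry is the column count

length(my_df)            # 2: the number of columns, not rows
```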
How can we find the column names in a data frame?
colnames(my_df)
## [1] "first_column" "second_column"
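For a data frame, names() gives the same answer as colnames(). The sketch below also shows two small things we can do with the result: pick out one name by position, and check whether a column exists.

```r
my_df = data.frame("first_column" = c(1, 2, 3),
                   "second_column" = c(4, 5, NA))

identical(names(my_df), colnames(my_df))   # TRUE
colnames(my_df)[2]                         # "second_column"
"second_column" %in% colnames(my_df)       # TRUE
"third_column" %in% colnames(my_df)        # FALSE
```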
Finally, what are lists?
Lists are super useful. They are not quite tables: a list can hold elements of different types and lengths, and those elements can themselves contain other elements, and so on.
For instance, this is a list:
about_me <- list(places = c("Tehran, Iran", "Lubbock, TX", "Brooklyn, NY", "Baltimore, MD", "Atlanta, GA"),
                 ice_creams = c("Vanilla"),
                 sports = c("Beach Volleyball", "Tennis", "Soccer"),
                 other_data = my_df)
about_me
## $places
## [1] "Tehran, Iran" "Lubbock, TX" "Brooklyn, NY" "Baltimore, MD"
## [5] "Atlanta, GA"
##
## $ice_creams
## [1] "Vanilla"
##
## $sports
## [1] "Beach Volleyball" "Tennis" "Soccer"
##
## $other_data
##   first_column second_column
## 1            1             4
## 2            2             5
## 3            3            NA
The way we call elements of a list is different from the way we call elements of a data frame. How can I call the third city I lived in?
about_me[[1]][[3]]
## [1] "Brooklyn, NY"
about_me[["places"]][[3]]
## [1] "Brooklyn, NY"
How can I call all the sports I like?
about_me[[3]]
## [1] "Beach Volleyball" "Tennis" "Soccer"
about_me[["sports"]]
## [1] "Beach Volleyball" "Tennis" "Soccer"
How do I call the second column of the data frame?
about_me[[4]][, 2]
## [1] 4 5 NA
about_me[[4]]["second_column"]
##   second_column
## 1             4
## 2             5
## 3            NA
about_me[[4]]$second_column
## [1] 4 5 NA
about_me[["other_data"]]["second_column"]
##   second_column
## 1             4
## 2             5
## 3            NA
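The calls above use double brackets, and that choice matters. The sketch below rebuilds the list and compares single brackets, which return a smaller list, with double brackets and $, which return the element itself.

```r
my_df = data.frame("first_column" = c(1, 2, 3),
                   "second_column" = c(4, 5, NA))
about_me <- list(places = c("Tehran, Iran", "Lubbock, TX", "Brooklyn, NY",
                            "Baltimore, MD", "Atlanta, GA"),
                 ice_creams = c("Vanilla"),
                 sports = c("Beach Volleyball", "Tennis", "Soccer"),
                 other_data = my_df)

names(about_me)    # "places" "ice_creams" "sports" "other_data"
length(about_me)   # 4: one per top-level element

# Single brackets keep the list wrapper; double brackets remove it
class(about_me["sports"])     # "list"
class(about_me[["sports"]])   # "character"

# $ is shorthand for [[ ]] with a name
about_me$places[3]                   # "Brooklyn, NY"
about_me$other_data$second_column    # 4 5 NA
```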