2.3 Accessing and manipulating variables
Now that we have described the main objects we will work with in R, we can discuss how to access specific information.
2.3.1 Accessing a single element
Given a vector vec, we can access its i-th entry with vec[i].
vec <- c(1,3,5)
vec[2]
## [1] 3
For a matrix or a dataframe we need to specify both the row and the column. If we have a matrix mat, we can access the element in entry (i,j) with mat[i,j].
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[1,3]
## [1] 7
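The result 7 may look surprising at first. By default, matrix fills its entries column by column, so the third column holds 7, 8 and 9. The following minimal sketch illustrates this and applies the same [i,j] syntax to a dataframe; the dataframe df and its column names are made up for illustration only.

```r
# matrix() fills the entries column by column by default
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol = 3, nrow = 3)
mat[1,3]   # first row, third column: 7
mat[3,1]   # third row, first column: 3

# Hypothetical dataframe, used only for illustration
df <- data.frame(a = c(10,20,30), b = c("x","y","z"))
df[2,1]    # second row, first column: 20
```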
2.3.2 Accessing multiple entries
To access multiple entries we can instead define a vector of indexes of the elements we want to access. Consider the following examples:
vec <- c(1,3,5)
vec[c(1,2)]
## [1] 1 3
The above code accesses the first two entries of the vector vec. To do this we defined a vector c(1,2) stating the entries we wanted to look at. For matrices consider:
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[c(1,2),c(2,3)]
##      [,1] [,2]
## [1,]    4    7
## [2,]    5    8
The syntax is very similar to before. We defined two index vectors, one for the rows and one for the columns. The two statements c(1,2) and c(2,3) are separated by a comma to denote that the first selects the first and second row, whilst the second selects the second and third column.
If one wants to access full rows or full columns, the argument associated with rows or columns is left blank. Consider the following examples.
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[1,]
## [1] 1 4 7
mat[,c(1,2)]
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
The code mat[1,] selects the first full row of mat. The code mat[,c(1,2)] selects the first and second column of mat. Notice that the comma must always be included!
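To see why the comma matters, the short sketch below compares indexing with and without it. Without the comma, R treats the matrix as one long vector read column by column and returns a single entry instead of a row.

```r
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol = 3, nrow = 3)

mat[2,]   # with the comma: the full second row, 2 5 8
mat[2]    # without the comma: a single entry, here 2
mat[,3]   # the full third column, 7 8 9
```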
To access multiple entries it is often useful to define sequences of numbers quickly. The following command defines the sequence of integers from 1 to 9.
1:9
## [1] 1 2 3 4 5 6 7 8 9
More generally, one can define sequences of numbers using the function seq.
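As an illustration, one common way to build more general sequences is the base R function seq, which accepts a step size (by) or a number of points (length.out). The sketch below also shows a sequence used as an index vector; the vector vec is chosen only for this example.

```r
1:9                                     # integers from 1 to 9
seq(from = 1, to = 9, by = 2)           # step of 2: 1 3 5 7 9
seq(from = 0, to = 1, length.out = 5)   # 5 equally spaced points: 0.00 0.25 0.50 0.75 1.00

# A sequence can be used directly as an index vector
vec <- c(10,20,30,40,50)
vec[2:4]                                # entries 2 to 4: 20 30 40
```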
2.3.3 Accessing entries with logical operators
If we want to access elements of an object based on a condition, it is often easier to use logical operators. This means comparing entries using the comparisons you would usually use in mathematical reasoning, for instance being equal to, or being larger than. The syntax is as follows:
== to check equality (notice the two equal signs)
!= to check non-equality
> bigger than
>= bigger or equal to
< less than
<= less or equal to
Let’s see some examples.
vec <- c(2,3,4,5,6)
vec > 4
## [1] FALSE FALSE FALSE  TRUE  TRUE
We constructed a vector vec and checked which entries were larger than 4. The output is a Boolean vector with the same number of entries as vec, where only the last two entries are TRUE. Similarly,
vec <- c(2,3,4,5,6)
vec == 4
## [1] FALSE FALSE  TRUE FALSE FALSE
has a TRUE in the third entry only.
So if we were interested in returning the elements of vec that are larger than 4, we could use the code
vec <- c(2,3,4,5,6)
vec[vec > 4]
## [1] 5 6
So we have a vector with only elements 5 and 6.
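The same pattern works with the other comparison operators. In the sketch below the Boolean vector is first stored under the illustrative name keep and then used as an index. Because TRUE counts as 1 when summed, sum also gives a quick count of the entries satisfying a condition.

```r
vec <- c(2,3,4,5,6)

keep <- vec != 4   # TRUE wherever the entry differs from 4
keep
vec[keep]          # entries different from 4: 2 3 5 6
vec[vec <= 3]      # entries less or equal to 3: 2 3
sum(vec > 4)       # number of entries larger than 4: 2
```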
2.3.4 Manipulating dataframes
We have seen in the previous section that dataframes are special types of matrices whose columns can hold different data types. For this reason they have special ways to manipulate and access their entries.
First, specific columns of a dataframe can be accessed using their name and the $ sign as follows.
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE), X3 = c("male","male","female"))
data$X1
## [1] 1 2 3
data$X3
## [1] male   male   female
## Levels: female male
So using the name of the dataframe data followed by $ and then the name of the column, for instance X1, we access that specific column of the dataframe.
Second, we can use the $ sign to add new columns to a dataframe. Consider the following code.
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE), X3 = c("male","male","female"))
data$X4 <- c("yes","no","no")
data
##   X1    X2     X3  X4
## 1  1  TRUE   male yes
## 2  2 FALSE   male  no
## 3  3 FALSE female  no
data now includes a fourth column called X4 coinciding with the vector c("yes","no","no").
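A new column does not have to be typed in by hand: it can also be computed from existing columns. The sketch below is only an illustration, and the column names double_X1 and X1_big are hypothetical.

```r
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))

# Hypothetical new columns computed from the existing column X1
data$double_X1 <- data$X1 * 2   # numeric column: 2 4 6
data$X1_big <- data$X1 > 1      # logical column: FALSE TRUE TRUE
data
```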
Third, we can select specific rows of a dataframe using the command subset. Consider the following example.
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE), X3 = c("male","male","female"))
subset(data, X1 <= 2)
##   X1    X2   X3
## 1  1  TRUE male
## 2  2 FALSE male
The above code returns the rows of data such that X1 is less or equal to 2. More complex rules to subset a dataframe can be built by combining conditions with the and operator & and the or operator |. Let’s see an example.
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE), X3 = c("male","male","female"))
subset(data, X1 <= 2 & X2 == TRUE)
##   X1   X2   X3
## 1  1 TRUE male
So the above code selects the rows such that X1 is less or equal to 2 and X2 is TRUE. This is the case only for the first row of data.
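The or operator | works in the same way: a row is kept as soon as at least one of the conditions holds. The following sketch uses the same dataframe; the names either and both are introduced only to store the results.

```r
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))

# Rows where X1 equals 1 OR X3 is "female": rows 1 and 3
either <- subset(data, X1 == 1 | X3 == "female")
either

# Rows where X1 is at least 2 AND X2 is not TRUE: rows 2 and 3
both <- subset(data, X1 >= 2 & X2 != TRUE)
both
```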
2.3.5 Information about objects
Here is a list of functions which are often useful to get information about objects in R.
length returns the number of entries in a vector.
dim returns the number of rows and columns of a matrix or a dataframe.
unique returns the unique elements of a vector or the unique rows of a matrix or a dataframe.
head returns the first entries of a vector or the first rows of a matrix or a dataframe.
order returns the indexes that put the entries of a vector in ascending order; these can be used to re-order a vector or the rows of a dataframe.
Let’s see some examples.
vec <- c(4,2,7,5,5)
length(vec)
## [1] 5
unique(vec)
## [1] 4 2 7 5
order(vec)
## [1] 2 1 4 5 3
So length gives the number of elements of vec, unique returns the different values in vec (so 5 is not repeated), and order returns in entry i the position in vec of the i-th smallest entry. So the first entry of order(vec) is 2 since the smallest entry of vec, namely 2, is in the second position.
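Since order returns positions rather than values, using its output as an index vector gives the sorted vector. The sketch below shows this together with head, which is listed above but not demonstrated; the name ord is chosen only for this example.

```r
vec <- c(4,2,7,5,5)

ord <- order(vec)   # positions of the entries from smallest to largest: 2 1 4 5 3
vec[ord]            # the sorted vector: 2 4 5 5 7

head(vec, 3)        # first three entries of vec: 4 2 7
head(vec[ord], 2)   # two smallest entries of vec: 2 4
```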
data <- data.frame(X1 = c(1,2,3,4), X2 = c(TRUE,FALSE,FALSE,FALSE), X3 = c("male","male","female","female"))
dim(data)
## [1] 4 3
So dim tells us that data has four rows and three columns.
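The other functions in the list can be applied to a dataframe in a similar way. The following is a minimal sketch on the same dataframe; the name sorted is introduced only to store the re-ordered rows.

```r
data <- data.frame(X1 = c(1,2,3,4), X2 = c(TRUE,FALSE,FALSE,FALSE),
                   X3 = c("male","male","female","female"))

head(data, 2)      # first two rows of data
unique(data$X3)    # distinct values in column X3
dim(data)[1]       # number of rows only: 4

# Re-order the rows by X1, largest value first
sorted <- data[order(data$X1, decreasing = TRUE),]
sorted
```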