## 2.3 Accessing and manipulating variables

Now that we have described the main objects we will work with in R, we can discuss how to access specific information.

### 2.3.1 Accessing a single element

Given a vector `vec`, we can access its i-th entry with `vec[i]`.

```
vec <- c(1,3,5)
vec[2]
```

`## [1] 3`

For a matrix or a dataframe we need to specify the associated row and column. If we have a matrix `mat`, we can access the element in entry (i,j) with `mat[i,j]`.

```
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[1,3]
```

`## [1] 7`
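The same `[i,j]` syntax applies to a dataframe. Here is a minimal sketch, assuming a small made-up dataframe `df` (its name, columns and values are illustrative only and do not appear elsewhere in this chapter):

```r
# Hypothetical dataframe, used only for illustration
df <- data.frame(height = c(170, 182, 165), weight = c(65, 80, 58))

# Row 2, column 1: the height of the second individual
second_height <- df[2, 1]
second_height   # 182
```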

### 2.3.2 Accessing multiple entries

To access multiple entries, we instead define a vector containing the indexes of the elements we want to access. Consider the following examples:

```
vec <- c(1,3,5)
vec[c(1,2)]
```

`## [1] 1 3`

The above code accesses the first two entries of the vector `vec`. To do this we had to define a vector using `c(1,2)` stating the entries we wanted to look at. For matrices consider:

```
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[c(1,2),c(2,3)]
```

```
##      [,1] [,2]
## [1,]    4    7
## [2,]    5    8
```

The syntax is very similar to before. We defined two index vectors, one for the rows and one for the columns. The two statements `c(1,2)` and `c(2,3)` are separated by a comma to denote that the first selects the first and second rows, whilst the second selects the second and third columns.

If one wants to access full rows or full columns, the argument associated with rows or columns is left blank. Consider the following examples.

```
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
mat[1,]
```

`## [1] 1 4 7`

`mat[,c(1,2)]`

```
##      [,1] [,2]
## [1,]    1    4
## [2,]    2    5
## [3,]    3    6
```

The code `mat[1,]` selects the first full row of `mat`. The code `mat[,c(1,2)]` selects the first and second columns of `mat`. Notice that the comma must always be included!
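To see why the comma matters, the sketch below compares indexing with and without it, using the same matrix `mat` as above. Without a comma, R treats the matrix as a single vector read column by column, so a lone index returns one value rather than a row or a column.

```r
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)

mat[2,]   # second row: 2 5 8
mat[,2]   # second column: 4 5 6
mat[2]    # no comma: the second value of the underlying vector, i.e. 2
```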

To access multiple entries it is often useful to define sequences of numbers quickly. The following command defines the sequence of integers from 1 to 9.

`1:9`

`## [1] 1 2 3 4 5 6 7 8 9`

More generally, one can define sequences of numbers using `seq` (see `?seq`).
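As a brief illustration, the sketch below builds two sequences with `seq` and uses sequences as index vectors. The vector `vec` and the names `odd`, `grid` and `middle` are chosen here for illustration only.

```r
# Sequence from 1 to 9 in steps of 2
odd <- seq(from = 1, to = 9, by = 2)
odd      # 1 3 5 7 9

# Five equally spaced numbers between 0 and 1
grid <- seq(from = 0, to = 1, length.out = 5)
grid     # 0.00 0.25 0.50 0.75 1.00

# Sequences used as index vectors
vec <- c(10, 20, 30, 40, 50)
middle <- vec[2:4]
middle   # 20 30 40
vec[odd[1:3]]   # entries 1, 3 and 5: 10 30 50
```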

### 2.3.3 Accessing entries with logical operators

If we want to access elements of an object based on a condition, it is often easier to use logical operators. This means comparing entries using the comparisons you would usually use in mathematical reasoning, for instance being equal to, or being larger than. The syntax is as follows:

- `==` to check equality (notice the two equal signs)
- `!=` to check non-equality
- `>` greater than
- `>=` greater than or equal to
- `<` less than
- `<=` less than or equal to

Let’s see some examples.

```
vec <- c(2,3,4,5,6)
vec > 4
```

`## [1] FALSE FALSE FALSE TRUE TRUE`

We constructed a vector `vec` and checked which entries were larger than 4. The output is a Boolean vector with the same number of entries as `vec` where only the last two entries are `TRUE`. Similarly,

```
vec <- c(2,3,4,5,6)
vec == 4
```

`## [1] FALSE FALSE TRUE FALSE FALSE`

has a `TRUE` in the third entry only.

So if we want to return the elements of `vec` that are larger than 4, we can use the code

```
vec <- c(2,3,4,5,6)
vec[vec > 4]
```

`## [1] 5 6`

So we have a vector with only elements 5 and 6.
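The same idea works with any of the comparison operators, and a Boolean vector can also select rows of a matrix. The sketch below is one possible illustration; the names `keep` and `big_rows` are hypothetical.

```r
vec <- c(2,3,4,5,6)

# Store the condition first, then use it as an index
keep <- vec != 4
keep        # TRUE TRUE FALSE TRUE TRUE
vec[keep]   # 2 3 5 6

# Rows of a matrix whose first-column entry is larger than 1
mat <- matrix(c(1,2,3,4,5,6,7,8,9), ncol=3, nrow=3)
big_rows <- mat[mat[,1] > 1, ]
big_rows    # second and third rows of mat
```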

### 2.3.4 Manipulating dataframes

We have seen in the previous section that dataframes are special types of matrices whose columns can hold different data types. For this reason there are special ways to manipulate and access their entries.

First, a specific column of a dataframe can be accessed using its name and the `$` sign as follows.

```
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))
data$X1
```

`## [1] 1 2 3`

`data$X3`

```
## [1] male male female
## Levels: female male
```

So using the name of the dataframe `data` followed by `$` and then the name of the column, for instance `X1`, we access that specific column of the dataframe.

Second, we can use the `$` sign to add new columns to a dataframe. Consider the following code.

```
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))
data$X4 <- c("yes","no","no")
data
```

```
##   X1    X2     X3  X4
## 1  1  TRUE   male yes
## 2  2 FALSE   male  no
## 3  3 FALSE female  no
```

`data` now includes a fourth column called `X4`, equal to the vector `c("yes","no","no")`.

Third, we can select specific rows of a dataframe using the command `subset`. Consider the following example.

```
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))
subset(data, X1 <= 2)
```

```
##   X1    X2   X3
## 1  1  TRUE male
## 2  2 FALSE male
```

The above code returns the rows of `data` such that `X1` is less than or equal to 2. More complex rules to subset a dataframe can be built by combining conditions with the and operator `&` and the or operator `|`. Let’s see an example.

```
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))
subset(data, X1 <= 2 & X2 == TRUE)
```

```
##   X1    X2   X3
## 1  1  TRUE male
```

So the above code selects the rows such that `X1` is less than or equal to 2 and `X2` is `TRUE`. This is the case only for the first row of `data`.
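For completeness, here is a sketch of the or operator `|`, together with a way to obtain the same kind of selection using the square-bracket indexing of Section 2.3.3. The names `either_rows` and `both_rows` are hypothetical and used only for illustration.

```r
data <- data.frame(X1 = c(1,2,3), X2 = c(TRUE,FALSE,FALSE),
                   X3 = c("male","male","female"))

# Rows where X1 equals 1 OR X3 is "female": the first and third rows
either_rows <- subset(data, X1 == 1 | X3 == "female")
either_rows

# Same selection as subset(data, X1 <= 2 & X2 == TRUE), written with
# a Boolean vector for the rows and a blank column argument
both_rows <- data[data$X1 <= 2 & data$X2 == TRUE, ]
both_rows
```

Inside `subset` the column names can be used directly, whereas with square brackets each column has to be referred to through `data$`.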

### 2.3.5 Information about objects

Here is a list of functions which are often useful to get information about objects in R.

- `length` returns the number of entries in a vector.
- `dim` returns the number of rows and columns of a matrix or a dataframe.
- `unique` returns the unique elements of a vector or the unique rows of a matrix or a dataframe.
- `head` returns the first entries of a vector or the first rows of a matrix or a dataframe.
- `order` returns the positions of the entries of a vector arranged in ascending order of their values; these positions can be used to re-order a vector or the rows of a dataframe.

Let’s see some examples.

```
vec <- c(4,2,7,5,5)
length(vec)
```

`## [1] 5`

`unique(vec)`

`## [1] 4 2 7 5`

`order(vec)`

`## [1] 2 1 4 5 3`

`length` gives the number of elements of `vec`, `unique` returns the different values in `vec` (so 5 is not repeated), and `order` returns in entry i the position in `vec` of its i-th smallest entry. So the first entry of `order(vec)` is 2 since the smallest entry of `vec`, which is 2, sits in the second position.
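A common use of `order` is sorting: indexing a vector with the output of `order` returns its entries in ascending order, and the same trick re-orders the rows of a dataframe. A minimal sketch follows; the dataframe `df` and the names `idx` and `sorted_df` are hypothetical.

```r
vec <- c(4,2,7,5,5)
idx <- order(vec)
idx        # 2 1 4 5 3
vec[idx]   # 2 4 5 5 7

# Hypothetical dataframe, re-ordered by the values of its column X1
df <- data.frame(X1 = c(3,1,2), X2 = c(TRUE,FALSE,TRUE))
sorted_df <- df[order(df$X1), ]
sorted_df$X1   # 1 2 3
```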

```
data <- data.frame(X1 = c(1,2,3,4), X2 = c(TRUE,FALSE,FALSE,FALSE),
                   X3 = c("male","male","female","female"))
dim(data)
```

`## [1] 4 3`

So `dim` tells us that `data` has four rows and three columns.
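The function `head` from the list above can be sketched in the same way. By default it returns the first six entries or rows, and its argument `n` changes that number. The vector `long_vec` is a made-up example.

```r
# Hypothetical vector with 20 entries
long_vec <- 1:20
head(long_vec)          # first six entries: 1 2 3 4 5 6
head(long_vec, n = 3)   # first three entries: 1 2 3

data <- data.frame(X1 = c(1,2,3,4), X2 = c(TRUE,FALSE,FALSE,FALSE),
                   X3 = c("male","male","female","female"))
first_two <- head(data, n = 2)   # first two rows of data
dim(first_two)                   # 2 3
```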