# C Programming Concepts

Although programming is not the focus of this book, there are several basic programming concepts that are useful for working with data in R. These concepts will be demonstrated in this short appendix.

## C.1 Conditional Statements

In general terms, **conditional statements** are used to control the flow of a program, depending on whether specified conditions are met. In this section we will explore two of the more common types of conditional statements. Although we present this material in the context of the R language, these concepts apply to most other languages as well (such as Python).

### C.1.1 `for` Loops

A **loop** allows for a set of commands to be repeated under a specific set of conditions; the `for` loop is one of the several types of loops available in R. A for loop has the following basic structure:

for(counter){

instructions

}

The commands that you write in the `instructions` section of the for loop are executed multiple times based on the value of `counter`. In the code chunk below, the index variable \(i\) takes on the values 1 through 5, so the body of the for loop (the command to print the number \(i\)) is executed 5 times. **Note** that we can use the colon notation `a:b` to get all integers between `a` and `b` (inclusive).
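A minimal sketch of such a loop, written to match the description above and the output shown below:

```r
# i takes the values 1, 2, 3, 4, 5 in turn; the body runs once per value
for (i in 1:5) {
  print(i)
}
```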

```
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
```

Now we’ll try something a little more complicated: we’ll use a for loop to sum all of the integers from 1 to 100. To start, we need to **initialize** the `sum` variable so that it equals 0. We will then use a for loop with 100 iterations to calculate a running sum of the integers from 1 to 100.
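One way this loop might be written, assuming the variable is named `sum` as in the text:

```r
# Initialize the running total.
# Note: this variable shares its name with the built-in sum() function;
# R still finds the function when sum(...) is called, but a different
# variable name would avoid any confusion.
sum <- 0

# Add each integer from 1 to 100 to the running total
for (i in 1:100) {
  sum <- sum + i
}

print(sum)
```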

`## [1] 5050`

This works according to the following logic:

- Initialize `sum` so that it starts equal to 0.
- Enter the for loop; in the first iteration \(i\) is equal to 1.
- Re-assign `sum` so that it equals its current value plus the value of \(i\). This means `sum` now equals 0 + 1 = 1.
- Start the next iteration of the for loop, where \(i\) equals 2.
- Re-assign `sum` so that it equals its current value plus the value of \(i\). This means `sum` now equals 1 + 2 = 3.
- Continue this until the 100th iteration of the loop.

In the above for loop, we did not store the result at each iteration of the loop. This means that we cannot go back and access the sum at, say, the 40th iteration; we can only see the sum after all 100 iterations of the loop are finished. For some types of problems, we want to be able to go back and access the results at all iterations of the loop.

To show this, let’s calculate the squares of all integers from 1 to 5. This time we start by initializing an empty vector called `squares`, which is done using `c()`. This empty vector will store the results from each iteration of the loop. The `append()` function is used to add each result to the end of this vector as we iterate over the values one through five.
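A sketch of how this might look, using the `squares` vector and `append()` as described:

```r
# Start with an empty vector to hold the results
squares <- c()

# Append the square of each integer from 1 to 5
for (i in 1:5) {
  squares <- append(squares, i^2)
}

print(squares)
```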

`## [1] 1 4 9 16 25`

This works according to the following logic:

- Initialize `squares` so that it is an empty vector.
- Enter the for loop; in the first iteration \(i\) is equal to 1.
- Calculate 1 squared (with the command `i^2`) and append it to `squares`.
- Start the next iteration of the for loop, where \(i\) equals 2.
- Calculate 2 squared (with the command `i^2`) and append it to `squares`.
- Continue this until the 5th iteration of the loop.
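The same idea can be applied to the earlier sum of 1 through 100, so that the total at any iteration (say, the 40th) can be looked up afterwards. The sketch below is only illustrative: the names `running_sums` and `total` are hypothetical, and it fills a vector created with `numeric(100)` (100 zeros) by position rather than using `append()`.

```r
# One slot per iteration, all initially 0
running_sums <- numeric(100)
total <- 0

for (i in 1:100) {
  total <- total + i
  running_sums[i] <- total   # store the running sum at iteration i
}

print(running_sums[40])    # sum of 1 through 40
print(running_sums[100])   # sum of 1 through 100
```

Because every intermediate value is stored, `running_sums[40]` returns 820 and `running_sums[100]` returns 5050.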

### C.1.2 `if/else` Statements

Often we encounter situations where we would like to run some code *if* a condition is `TRUE`, or run different code if the condition is `FALSE`. For these situations we need to use `if/else` statements, which take the general form:

if(condition1){

code block 1

} else if (condition2){

code block 2

} else{

code block 3

}

If `condition1` is true, then R runs the code in `code block 1` and ignores `code block 2` and `code block 3`. If `condition1` is not true and `condition2` is true, R skips `code block 1` and `code block 3` and runs `code block 2`. Finally, if neither `condition1` nor `condition2` is true, R runs `code block 3`. For example:

```
values <- c(-5, -2, 0, 1)
for (value in values) {
  # Classify each value by its sign
  if (value < 0) {
    print("Negative")
  } else if (value == 0) {
    print("Zero")
  } else {
    print("Positive")
  }
}
```

```
## [1] "Negative"
## [1] "Negative"
## [1] "Zero"
## [1] "Positive"
```

Note that you do not need to include an `else if` statement if you only have one condition to evaluate:

```
values <- c(-5, -2, 0, 1)
for (value in values) {
  # Only one condition to check, so a plain if/else is enough
  if (value < 0) {
    print("Negative")
  } else {
    print("Not Negative")
  }
}
```

```
## [1] "Negative"
## [1] "Negative"
## [1] "Not Negative"
## [1] "Not Negative"
```

## C.2 Functions

Throughout the book, we have seen many examples of built-in R functions. However, we can also define our own functions! Imagine we wanted to calculate the compound interest on an investment with the following formula:

\[A = P(1 + \frac{r}{n})^{nt}\]

\(A =\) final amount

\(P =\) principal balance

\(r =\) interest rate

\(n =\) number of times interest is applied per time period

\(t =\) number of time periods

Of course, we could write out the formula arithmetically every time we wanted to calculate compound interest. Let’s say \(P = \$10,000\), \(r = 0.10\), \(n = 12\), and \(t = 5\). Using the formula above:
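Written out arithmetically, the calculation might look like the following sketch, which plugs the stated values directly into the formula:

```r
# P = 10000, r = 0.10, n = 12, t = 5
A <- 10000 * (1 + 0.10 / 12)^(12 * 5)
print(A)
```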

`## [1] 16453.09`

Now imagine we wanted to calculate compound interest on many different investments and compare them. We could copy-and-paste the code above, each time changing the values of \(P\), \(r\), \(n\), and \(t\). However, imagine after doing this we realized that there was a mistake in our original formula. We would then need to go back and fix that mistake in every line of code that we copy-and-pasted from the original. We can prevent this headache by defining our own function to calculate compound interest, and then applying that function many times. If we notice a mistake in our formula, we simply need to fix it in the definition of the function and *not* in every single line of code.

If you find yourself repeatedly copy-and-pasting a chunk of code, this is a good sign that you should define a function.

How can we define our own functions in R? Function definitions take the following form:

function_name <- function(arg1, arg2, …){

…code block…

return(result)

}

*Required*

- `function_name`: The name of our new function. Function names follow the same basic naming rules that we saw in Section 2.
- `...code block...`: The code we want the function to apply. This is where we will write the compound interest formula.

*Optional*

- `arg1, arg2, ...`: Any arguments we want the function to accept. In our compound interest example, we want the function to accept the arguments \(P\), \(r\), \(n\), and \(t\). Note that arguments are optional, so a function can take no inputs.
- `return(result)`: Any values or objects we want the function to return. In our compound interest example, we will return the result of the compound interest calculation. Note that this is optional, so a function does not need to return anything.

Now let’s create a function called `compound_interest()`. Following the syntax shown above, we can define this function as follows:
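A minimal sketch of such a definition, assuming the arguments are named `P`, `r`, `n`, and `t` to match the formula:

```r
# P: principal balance
# r: interest rate
# n: number of times interest is applied per time period
# t: number of time periods
compound_interest <- function(P, r, n, t) {
  A <- P * (1 + r / n)^(n * t)
  return(A)
}
```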

We can then apply our new function to calculate compound interest:
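For instance, a call with the same values used earlier might look like this. The definition is repeated only so the chunk runs on its own; the argument names are an assumption based on the formula's symbols.

```r
# Definition repeated so this chunk is self-contained
compound_interest <- function(P, r, n, t) {
  return(P * (1 + r / n)^(n * t))
}

# P = $10,000, r = 0.10, n = 12, t = 5
result <- compound_interest(P = 10000, r = 0.10, n = 12, t = 5)
print(result)
```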

`## [1] 16453.09`
