F.3 Essential R Markdown

The range of options available in R Markdown is paradoxical. For beginners, the wealth of commands in the R Markdown Cheatsheet may seem overwhelming. At the same time, more experienced users often complain about the lack of options in R Markdown, especially when comparing it to typesetting systems like LaTeX. R Markdown is a serious attempt at hitting the sweet spot between simplicity and complexity: it aims to provide a limited set of features that are good enough for most purposes, but leaves more exotic tasks to more specialized systems.

Fortunately, the range of commands needed to benefit from R Markdown is very limited. Any R Markdown document consists of 3 parts:

  1. A header for setting global document options,

  2. Text that may contain headings, paragraphs, and itemized lists, and

  3. Code chunks that contain and evaluate R code.

This small set of commands (introduced in the following sections) will allow you to create fancy data science reports that will impress your instructors and employers long before you have explored everything that R Markdown has to offer.

F.3.1 Header

The header of a new .Rmd document can contain many technical things, but should primarily contain the following 4 elements:

  1. the title of your report,

  2. the name of its author(s),

  3. the current date, and

  4. the output format of the document.

Different options for output formats exist, but .html tends to be the most convenient and portable one.
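A minimal header covering these four elements might look as follows. This is only a sketch: the title and author are placeholder values, and the inline expression `r Sys.Date()` is one common way of inserting the current date when knitting (a fixed date string works just as well).

```yaml
---
title: "My first report"     # placeholder title
author: "Jane Doe"           # placeholder author
date: "`r Sys.Date()`"       # evaluated when knitting the document
output: html_document        # knit to an .html file
---
```

The header is placed at the very top of the .Rmd file and is enclosed by two lines that contain only three dashes.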

F.3.2 Text

The body of an .Rmd document is essentially a plain text document that accepts R Markdown commands to structure and format text elements. R Markdown is similar to many other markup languages (e.g., HTML, XML, and TeX) in using special symbols to signal formatting commands. While using HTML or TeX can require very complicated commands and tags, the basic idea of Markdown is to provide a simple and limited set of commands that satisfy the most common formatting needs. Good examples of simple solutions that are intuitive and work well are how R Markdown structures headings, signals emphasis, and allows creating itemized lists.

Headings

Headings indicate the structure of your document by separating it into multiple sections. An unfortunate coincidence that may confuse you for a minute is that R Markdown uses the symbol # (i.e., the hash or pound symbol) as a prefix to section titles. Obviously, this differs from the use of the symbol # in R code, where it is used as a prefix for commenting out lines.

A main (or 1st level) heading entitled Literature Review would be indicated by typing # Literature Review (and capitalized in APA format). Lower-level subsections (and sub-sub-sections, etc.) can then be created by using 2 (3, or more) # symbols:

# 1st Level Heading

## 2nd level heading

### 3rd level heading

It is important to remain consistent about the number of levels within a document. Introducing a lower level of subsections typically only makes sense when there are at least two headings on that level.

Formatting text

Text formatting — like typing words in italics or bold — is easy as well.
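As a brief sketch of the most common cases (these are standard Markdown conventions, and the example sentences are made up), emphasis is signaled by enclosing text in asterisks or underscores:

```markdown
This word is in *italics*, and so is _this one_.

This word is in **bold**, and so is __this one__.

This phrase is in ***bold italics***.
```

One symbol on each side yields italics, two yield bold, and three combine both.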

Lists

A neat feature of R Markdown is that it makes creating itemized (bullet-point or enumerated) lists very easy.

Bullet-point lists

To create a bullet-point list, simply prefix lines of text with the dash - (or asterisk *) symbol. For instance, typing:

- First point
- Second point
    - Subpoint A
    - Subpoint B
- Third point

will show up in the output file as:

  • First point
  • Second point
    • Subpoint A
    • Subpoint B
  • Third point

Provided that items on the 2nd level are preceded by at least 4 spaces, they are automatically distinguished from the items on the 1st level.

Enumerated lists

To create an enumerated list, simply prefix lines of text with n. (with n being a number). For instance, typing:

1. First point
1. Second point
    1. Subpoint A
    1. Subpoint B
1. Third point

will show up in the output file as:

  1. First point
  2. Second point
    1. Subpoint A
    2. Subpoint B
  3. Third point

Again, sub-items are distinguished from top-level items by indenting them (by at least 4 spaces). Note that we were lazy and used 1. on every line, but R Markdown was smart enough to count our items and sub-items.

Line vs. paragraph breaks

A subtle but important feature of R Markdown is its use of blank lines (i.e., lines that only contain spaces and/or a line break, typically typed by hitting Enter). Three concepts to distinguish here are line breaks, paragraph breaks, and manual line breaks:[65]

  • A regular line break in the .Rmd input file appears as a space in the output document. Thus, R Markdown (like many other markup languages) interprets line breaks as spaces, rather than as the beginning of a new paragraph.[66]

  • To indicate an actual paragraph break, we need to insert one or more empty lines between strings of text. When getting used to this, inserting empty lines between different parts (e.g., between headings, lines of text, and code chunks) is a convenient and useful way to structure a document.[67]

  • An occasionally confusing feature of R Markdown is that ending a line with two or more spaces forces a manual line break. Thus, typing 3 words on 3 lines, like:

ene
mene
miste

could either be rendered as:

ene mene miste

or as:

ene
mene
miste

in the output file, depending on the number of (invisible) spaces behind each line. (Specifically, the three words are rendered in the same line of text if there are 0 or 1 spaces after each word, and into separate lines of text when there are 2 or more spaces after each word.)

It is annoying that this difference is not visible in the source document, as spaces are invisible. Alas, even R Markdown is not perfect. Fortunately, human beings can adapt to various circumstances and constraints. To get started, just remember two rules: 2 spaces at the end of a line force a line break, and different paragraphs of text are separated by blank lines.
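Because trailing spaces are invisible, it can help to let R find them. The following is a minimal sketch, not an official tool: the vector `rmd_lines` is a hypothetical stand-in for the lines of a source file (which could be obtained by calling `readLines()` on a file of your own).

```r
# Hypothetical input: three lines, of which only the first ends in 2 spaces
rmd_lines <- c("ene  ", "mene", "miste")

# Flag lines that end in 2 or more spaces (i.e., forced manual line breaks)
forces_break <- grepl(" {2,}$", rmd_lines)

# Positions of the flagged lines
which(forces_break)
```

Here, only the first line is flagged, so the output would break after the first word and show the remaining two words on the same line.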

F.3.3 Code chunks

All commands mentioned in this section so far were Markdown commands that provide a nifty notepad, but have not yet required any R code. The real fun starts when mixing text with code.

Creating chunks

To signal the switch from text to code, insert a code chunk by typing the chunk delimiting symbols ```{r} (to start a chunk) and ``` (to end it). In R Studio, using the keyboard shortcut Cmd + Alt + I immediately yields a new empty chunk that accepts R code:

```{r}  
     
```  

Everything that is contained within this chunk works exactly like an R script. This means that, within a chunk or a sequence of chunks, you can define objects, provide comments (now using # as the comment symbol again), and evaluate code line by line, just as in any ordinary R script. When knitting the document, all code chunks are evaluated and the results are displayed (unless you select chunk options that prevent this).

Importantly, any code chunk needs both a beginning (```{r}) and an end (```). R Studio typically shows any text on a white background and code chunks on a grey background. If this changes anywhere in your document, chances are that you opened but forgot to close a chunk. To avoid this error, get into the habit of always writing both parts when creating a new chunk.
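As a small illustration (the object name `x` and its values are made up for this sketch), a complete chunk that defines an object, contains comments, and prints a result could look like this:

```{r}
# Define a numeric vector (within a chunk, # marks a comment again)
x <- c(2, 4, 6)

# Compute and print its mean
mean(x)
```

As no chunk options are set here, knitting the document would show both the code and its result in the output.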

Chunk options

Chunks can be named and their default behavior can be changed by setting many options. To get started, it makes sense to provide a unique name for each chunk (e.g., to facilitate navigation in large documents and obtain more informative error messages when making a mistake) and to restrict one’s use of chunk options to:

  • echo: Show the code chunk in the output?
    The default setting is echo = TRUE, but we can use echo = FALSE to hide the code in the output document.

  • eval: Evaluate the code chunk when creating the output?
    The default setting is eval = TRUE, but we can use eval = FALSE to prevent R from evaluating the chunk (e.g., when this would yield an error).

For instance, creating a new chunk:

```{r plot_cars, echo = TRUE, eval = FALSE}   
 plot(cars,   
      xlab = "Speed (mph)",    
      ylab = "Stopping distance (ft)")   
```   

will show the chunk plot_cars in the output document (due to the option echo = TRUE), but not create or show the corresponding plot (due to eval = FALSE).
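Conversely, swapping the two option values hides the code but evaluates it, so that only the plot appears in the output document. The following variant is a sketch of this opposite setting, with `plot_cars_only` being a made-up chunk name:

```{r plot_cars_only, echo = FALSE, eval = TRUE}
plot(cars,
     xlab = "Speed (mph)",
     ylab = "Stopping distance (ft)")
```

Since eval = TRUE is the default setting, writing echo = FALSE alone would have the same effect.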

Some chunks (e.g., an initial one loading required packages like those of the tidyverse) may create messages or warnings that you may wish to hide in your output document. Setting the chunk options message and warning to FALSE lets you accomplish this:

```{r load_pkg, message = FALSE, warning = FALSE}
library(tidyverse)
``` 

The full list of chunk options is long (e.g., see http://yihui.name/knitr/options/), but most people (including many users of R Markdown) live quite happily without ever using them.

Inline chunks

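A second way of evaluating R code in R Markdown is to embed it directly into the text by enclosing it in `r ` (i.e., a pair of backticks, with the letter r and a space following the opening backtick).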

For instance,

  • `r v <- 1:3; sum(v)` evaluates to 6 in the output document, and

  • `r nrow(cars)` determines that the cars dataset contains 50 rows.

Inline chunks typically contain very brief R commands, but can be immensely useful for characterizing datasets or mentioning the values of results (e.g., means, SD, or \(p\) values computed in an analysis).
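Within a sentence, inline chunks could be used as in the following sketch. The sentence itself is invented for illustration, and it assumes the built-in `cars` dataset with its `speed` column:

```markdown
The `cars` dataset contains `r nrow(cars)` observations
with a mean speed of `r round(mean(cars$speed), 1)` mph.
```

When knitting, each inline chunk is replaced by its computed value, so the numbers in the text stay in sync with the data.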

F.3.4 Advanced features

R Markdown supports a range of more sophisticated features. But as many of those are not needed at first, we only mention some common ones:

  1. Images can be included by ![Image caption](path/file.ext).
    For instance, the expression ![Data science for psychologists](./images/logo.png) results in:

    [Image: logo, with the caption “Data science for psychologists”]

  2. Mathematical formulas can be enclosed in $$ and are written in LaTeX style.
    For instance, the expression $$\bar{x} = \frac{1}{n} \cdot \sum_{i=1}^{n} x_{i}$$ yields:

\[\bar{x} = \frac{1}{n} \cdot \sum_{i=1}^{n} x_{i}\]

  3. External links are entered as [Link text](URL).
    For instance, the expression [ds4psy](https://bookdown.org/hneth/ds4psy/) yields ds4psy.

  4. Footnotes can be included by ^[Footnote text.] and will be numbered automatically.[68]

  5. Please distinguish between hyphens and various types of longer dashes:

    • a hyphen is typed as - and displayed as -;
    • an en-dash is typed as -- and displayed as –;
    • an em-dash is typed as --- and displayed as —;
    • a minus sign should have the same width as a plus. They are typed as $+$/$-$ and displayed as \(+\)/\(-\).

  6. A non-breaking space prevents a line break at a position that is printed as a space, but should not be split across lines. For instance, references to Table X, Figure Y, and Appendix Z should never be broken apart at line endings. To type a non-breaking space in R Markdown, use the HTML command &nbsp;.
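Taken together, these features could appear in the body of an .Rmd document as in the following sketch. The sentences, page numbers, and table and figure numbers are made up for illustration, whereas the image path and the link are those from the examples above:

```markdown
![Data science for psychologists](./images/logo.png)

The sample mean is defined as:

$$\bar{x} = \frac{1}{n} \cdot \sum_{i=1}^{n} x_{i}$$

See [ds4psy](https://bookdown.org/hneth/ds4psy/) for details.^[Footnote text.]

Read pages 10--20 --- or skip them --- and note that $-$ differs from -.

As shown in Table&nbsp;1 and Figure&nbsp;2, the effect is small.
```

Note the blank lines between the different elements, which keep them in separate paragraphs.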

F.3.5 Common errors

Common errors of novice R Markdown users include the following:

  • Erroneous R code or missing objects: If your code contains errors or tries to use missing objects, the document will fail to knit.

  • Repeating chunk names: Every named chunk must have a unique name.

  • Including R interface commands: Commands that call the R help system (like ?mtcars or ?c) or show a table in a tab (like View(mtcars)) will typically fail to knit and yield an error. Comment them out in your code prior to knitting (or set eval = FALSE for the corresponding chunk, if its results are not needed elsewhere).

An error will typically prevent your .Rmd file from knitting. When an error occurs, try to understand its error message and correct it in your .Rmd file. If this fails, enter the message into a search engine to check how others have dealt with the same problem.
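As a sketch of the fix for interface commands (the chunk name `inspect_mtcars` is made up), the interactive calls are commented out, while a command that simply prints its result is kept:

```{r inspect_mtcars}
# ?mtcars       # help call: commented out before knitting
# View(mtcars)  # viewer tab: commented out before knitting
head(mtcars)    # prints the first rows of the data
```

Removing the comment symbols again makes the commands available for interactive use, so nothing is lost by commenting them out.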

F.3.6 R Studio nuggets

Once you are comfortable with the basics of writing text and code in the same document and knitting it to create an output file, it may be worthwhile to spend a few minutes exploring the interface options provided by the R Studio IDE. For instance, any newly-created code chunk immediately provides 3 small buttons in its top right corner:

  1. The symbol for the first button looks like a gear wheel and lets you name the chunk, as well as set the most common chunk options. Applying a few of the pre-defined settings and observing their effects on the definition of the chunk (in curly braces {} on the left) is a convenient way to learn more about chunk options.

  2. Clicking on the downward-triangle evaluates all code chunks above the current one. This is useful when current R objects depend on those in previous chunks and may have changed — or you just took a break and are starting a new session at an advanced position in a file.

  3. Clicking on the right-facing triangle lets you run all code in the corresponding chunk. This is useful for evaluating multiple code steps that are built up throughout larger chunks. The same effect can also be achieved by using the Cmd + Shift + Enter keyboard shortcut from anywhere within the chunk.

Using R Markdown in R Studio also takes the notion of foldable sections to a new level. Not only can you open and close code chunks (by clicking on the small triangle to the left of each chunk or entering Cmd + Alt + L and Cmd + Alt + Shift + L), but the concept of folds also extends to the text sections structured by different levels of headings. Thus, getting into a habit of using the Cmd + Alt + (Shift) + O and Cmd + Alt + (Shift) + L keyboard shortcuts will make it much easier to navigate large and complex documents.

Finally, using keyboard shortcuts for your 7±2 most frequent commands is likely to save you hours on this course, let alone the time-saving benefits for the rest of your life. Help on these shortcuts is available in R Studio via Alt + Shift + K or by selecting Help > Keyboard Shortcuts Help.

F.3.7 Mixing markup languages

Another powerful feature of R Markdown is that we can include and mix commands from other markup languages. For instance, I often use HTML to insert comments, images, or special symbols (e.g., &plusmn; to show the ‘±’ symbol in the previous section or &nbsp; to enter the non-breaking spaces that are essential for any serious typesetting effort) or LaTeX commands to enter mathematical symbols or formulas (e.g., \(\sum_{i=1}^{n} x_{i}\)):

<!-- A comment. -->

<!-- An HTML image with a link: -->    
<a href="https://bookdown.org/hneth/ds4psy/">
<img src = "./images/logo.png" alt = "logo" style = "width: 100px; float: right;"/>
</a>

<!-- A LaTeX formula: --> 
$\sum_{i=1}^{n} x_{i}$

  65. These distinctions apply to any typesetting system, not R Markdown in particular. In most systems, the difference between line breaks and paragraph breaks is somewhat blurred, leading to many typographical errors (not to mention the widespread ignorance regarding different dashes and non-breaking spaces). Unfortunately, typing a non-breaking space in R Markdown (which is required to type expressions or names that should never be split into separate lines, like “R Markdown” or “Figure X”) is not easy. In this book, I use the HTML command &nbsp;.

  66. This may seem confusing at first, but actually helps to structure arguments while writing them.

  67. Especially when working on laptops, students initially try to fit as many symbols as possible into a small amount of screen space. To avoid strange errors, get used to inserting at least one empty line between different parts of your .Rmd document (i.e., between headings, text, and chunks).

  68. This is the footnote.