Chapter 2 This is level 1 heading (main heading)
2.1 This is level 2 heading (subheading)
2.1.1 This is level 3 heading
2.1.1.1 And so on…
- Single dash and space after a blank line creates an unordered list (bulletpoints).
- Lists can be nested by indenting using four spaces (or two Tabs).
- Deeper and deeper…
- Lists can be nested by indenting using four spaces (or two Tabs).
- This is italics.
- This is also italics.
- This is boldface.
- This is also boldface (two underscores).
- This is strikethrough.
- This is monospace (backticks are usually right above Tab).
- This is an ordered list
- It is numbered
- Nested elements cannot be ordered. 2.1 You can try but it will look like this.
- If you don’t want to worry about numbering lists manually.
- You can create them like this
- Easy, isn’t it?
A single line break doesn’t do anything.
More than one line break…
…creates a new paragraph.
The \ (there is an important blank space after it!) is a non-breaking space; any two words separated by it will be forced to stay on the same line and will not be split over two lines.
Here, we use it to create a gap between paragraphs by inserting two empty lines.
And most importantly for the Rmd-R integration:
# This is an R code chunk
# Here you can write code and R will run it when you generate your document
# and display the output below
6 * 7
## [1] 42
Any R code can also be evaluated in-line like this: 2 + 3 = 5.
Feel free to take a moment to make sure you understand the relationship between the R Markdown notation and the resulting output. For a quick reference guide to Rmd, see this cool cheat sheet.
2.2 Rmd read-along
OK, now that you know the very basics, let’s look at the .Rmd file step-by-step.
The first thing to realise is that an .Rmd file is just a plain text file (such as .txt). You could open it in Notepad, MS Word, or OpenOffice and would basically see the same thing as in R Studio. The only reason for the special .Rmd extension is for R Studio to know to put all the nice colours in to aid readability and offer you options associated with R Markdown, such as the option to actually generate a document from the file. So don’t go away thinking there’s some magic going on here: these are just text files.
With that out of the way, keep reading the document in your browser but let’s scroll all the way up in the .Rmd file. There, you can see this header:
---
title: "Introducing R Markdown"
author: "dapR 1 -- Lab 2"
output:
  html_notebook:
    theme: flatly
    code_folding: show
---
For reasons you don’t need to worry about, this header is written in a different markup language called YAML (originally short for Yet Another Markup Language – no kiddin’! – although the name now officially stands for YAML Ain’t Markup Language). Here, you provide the title of the document, the output format, and many other general options.
In our document, we set the title and author and define the output to be an R notebook. R notebook is a HTML5 file just like most websites, which is why we can easily put it online like this. The neat feature of R notebooks is their ability to show/hide and evaluate code chunks and the fact that you can easily download and edit them in R Studio. That is why we will be using them in our dapR labs.
The theme parameter indented under html_notebook specifies what the document looks like.
While you can customise the aesthetics of your documents to your heart’s delight, some nice and smart people have provided us with several basic themes that, in our view, look pretty neat.
Finally, the code_folding parameter governs whether the code chunks should be shown or hidden by default.
While there’s a host of options you can play around with, it is a good idea to always include at least the title and output.
OK, next, there are two code chunks.
The first one gets generated automatically by R Studio when you create a new .Rmd file (more on that later) and is there to set a very basic default “code chunk option” echo=TRUE.
This option tells R Studio to create the output file with the code chunks visible.
Changing it to echo=FALSE will create a document with code not displayed.
You can specify other default options if you wish but that’s a bit of an advanced topic.
Notice two further things about the chunk:
- It is named (setup) – This doesn’t really do anything but it can be helpful when diagnosing code errors and it’s kind of tidy.
- There are further chunk options; in this case include=FALSE. This particular option makes the code chunk get evaluated but shows neither the code nor its output in the final document. In other words, it executes the code quietly in the background.
Taken together, the last two paragraphs mean that there are two ways of setting code chunk options:
- Globally – Just like the code inside of the first chunk does. Once set like this, the options will apply to all subsequent code chunks.
- Locally – Inside the {r, ...} bit at the top of each chunk. These options will apply only to the given code chunk.
There are, again, lots of useful options you can set and, using local options, you can change the behaviour of each individual chunk regardless of what the default—global—setting is. A comprehensive and by no means necessary list can be found in this R markdown reference guide.
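The two ways of setting options can be sketched as follows. This is a minimal illustration, assuming the knitr package (the tool that processes code chunks for R Markdown) is installed. The global default is set with R code inside a chunk, while the local override is not R code at all but part of the chunk header, so it only appears in a comment here; the chunk name quiet_chunk is made up for the example.

```r
# Global: typically placed in the first ("setup") chunk of the .Rmd file.
# Every later chunk inherits this default unless it says otherwise.
knitr::opts_chunk$set(echo = TRUE)

# Ask knitr what the current global default for echo is
knitr::opts_chunk$get("echo")

# Local: written in the header of one particular chunk, for example
#   {r quiet_chunk, echo=FALSE}
# (quiet_chunk is a hypothetical name). This hides the code of that
# chunk only; all other chunks keep following the global default.
```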
The second code chunk illustrates this rather nicely. Despite setting echo to TRUE in global options in the first chunk, the second one sets it to FALSE.
This means that, for this chunk only, the code will get executed and its output displayed but the code chunk itself will not show up in the final document.
However, as it happens, the code in this chunk doesn’t have any output so, in this case, echo=FALSE is indistinguishable from include=FALSE.
To see the difference, have a look at this chunk:
## Here, output gets included in the document but the code does not!
With respect to the actual contents of the second chunk, don’t worry about it too much. You are not supposed to understand it at this stage. If you’re really curious though, the code creates a function that puts the “Task X:” before the actual wording of the tasks so that we don’t have to type it all out and worry about which number this particular task is. We’re lazy like that, you see…
The rest of the .Rmd file should be fairly readable, especially with the benefit of knowing the markdown syntax for text formatting we talked about above. Remember that, by comparing the .Rmd with the lab sheet, you can always figure out how to do things you haven’t explicitly been taught (e.g., writing in superscript or in subscript).
Perhaps the only slightly puzzling looking bits are the links to other websites. It is not immediately important for you to know how to include these links (AKA hyperlinks, or URLs) so feel free to skip the next section.
2.3 A short aside on URLs in R Markdown
The anatomy of URL markdown (again, that’s links to you and me) is pretty straightforward. If you want to display the actual URL and make it “clickable”, put the address inside < >:
https://www.some.site (not an actual website)
If you want to link to a website using custom text, this is how you do it (the bit in quotes is optional):
text you want to make “clickable” (also doesn’t work)
By default, links open in the same tab which can be annoying. To make a link open in a new tab, add {target="_blank"} after the ()s:
Actual example (opens in new tab) – hover over this with your mouse for a moment to see the mouseover info appear.
2.4 Code chunks
Let’s talk a little more about code chunks (and in-line code), since they are the main reason why Rmd is so useful when it comes to reports of statistical analysis. For one, they are great for creating tables and figures. As a basic demonstration, we can create a simple histogram. Again, at this point, you don’t have to worry about understanding the code itself. The important bit is that, once you know how to create fancy plots and tables, you can create them directly in your .Rmd file to put them in your paper/report/presentation:
library(ggplot2) # load the ggplot2 package
qplot(rnorm(1000), xlab = "Value", ylab = "Frequency") # basic quick histogram
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
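A side note in case the code above prints a warning on your machine: recent versions of ggplot2 mark qplot() as deprecated. The sketch below draws the same kind of histogram with ggplot() instead. The data frame sim and its column value are names made up for this illustration, and bins = 30 simply spells out the default mentioned in the message above.

```r
library(ggplot2)

# 1000 random draws from a standard normal distribution
sim <- data.frame(value = rnorm(1000))

# the same kind of histogram, built with ggplot() instead of qplot()
p <- ggplot(sim, aes(x = value)) +
  geom_histogram(bins = 30) +
  labs(x = "Value", y = "Frequency")

p  # printing the object draws the plot
```

Because rnorm() generates new random numbers on every run, the exact shape of the histogram will differ slightly each time.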
That’s pretty cool, isn’t it? What’s arguably even cooler is the fact that you can incorporate code in the actual body text. Let’s say we have a chunk of code that runs some analysis, for example takes the mean age of our sample.
# create a made up sequence of numbers and pretend they are the ages of our participants
age <- c(34, 22, 26, 25, 43, 19, 19, 20, 33, 27, 27, 26, 54)
# calculate their mean, rounded to 2 decimal places
mean_age <- round(mean(age), digits = 2)
With Rmd, we don’t really even have to know what the value of the mean is when writing the results. We can simply use in-line code to have R Studio generate a document that says that the mean age was 28.85.
For the time being, don’t worry about how this is actually done. We will cover that later in depth. For now, simply rejoice in the fact that it can be done ;).
This feature has a very useful consequence: You can write a document in such a way that, if something about your data or analysis changes, you can simply edit the code in the appropriate chunks, re-generate the output file and all the values will get updated. Imagine having to redo a table of 40, 50, 100 numbers – that’s an awfully tedious task and it’s prone to human error. With a proper use of R Markdown you will never have to do it! Imagine how many hours of work that will save you (trust us, it’s a lot). How amazing is that?
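To make the idea of values updating themselves a bit more concrete, here is a small sketch that reuses the made-up age data from above. The extra participant aged 31 is purely hypothetical and only there to show that re-running the same code produces the new value.

```r
# original (made-up) ages from the chunk above
age <- c(34, 22, 26, 25, 43, 19, 19, 20, 33, 27, 27, 26, 54)
mean_age <- round(mean(age), digits = 2)
mean_age  # 28.85

# suppose one more (hypothetical) participant, aged 31, joins the sample
age <- c(age, 31)

# re-running the very same line gives the updated value;
# nothing has to be recalculated or retyped by hand
mean_age <- round(mean(age), digits = 2)
mean_age  # 29
```

Any text or table in the document that refers to mean_age would pick up the new value the next time the document is generated.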
2.5 Generating documents
Now that you have an understanding of the basics of Rmd along with some nifty tricks and can read the source file, let’s talk about how to generate output from the .Rmd’s.
The simplest way of turning the source into output is using the pre-defined shortcuts.
Task 4: Press Ctrl + ⇧ Shift + K (Windows/Linux) or ⌘ Command + ⇧ Shift + K (Mac OS) to generate a HTML version of this document.
Hopefully, nothing happened and maybe you spotted R giving you an error statement of some sort written all in red!
The reason for this is that, before we generate the file, we need to “run” all the code chunks so that R Studio has access to their output.
There are several ways of doing this but the easiest is, once again, with a shortcut.
Task 5: Press Ctrl + Alt + R (Windows/Linux) or ⌘ Command + Alt + R (Mac OS) to run all chunks in this .Rmd file.
Task 6: Wait a few seconds for R to execute your command and then try creating the HTML document again.
The first time you generate a document like this, it can take a while for R to install and run all the tools necessary to produce your output.
After a moment, the result should pop out in R Studio’s internal viewer.
Take a minute to marvel at your creation!
…
OK, that’s plenty now! Close the viewer window and check your “Week_02” folder. Therein, you should find a file called “Week02_Rmd_intro.nb.html” (the .nb bit indicates it’s an R notebook file). This is your actual output. If you open it, it should appear in your default web browser because HTML files are the stuff websites are made from.
Next, let’s test the editability feature we have so lauded above! Check the value of the mean of the age variable. In the original file, it should be 28.85.
Task 7: Try changing some numbers in the age variable in the corresponding code chunk, re-run all chunks, and re-generate the file to convince yourself that the mean age will get updated automatically.
Lo and behold, the value is still 28.85… (seriously, change it to something else!)
Now, let’s imagine you don’t want a HTML file but a .docx (Word document).
In order to get that, you need to change the YAML header so that it reads exactly output: word_document.
Task 8: Generate a Word document from your .Rmd file.
If you don’t have MS Office installed on your computer but are using OpenOffice, change the header to output: odt_document.
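Editing the header and pressing the shortcut is not the only route. As a rough sketch, the same output formats can also be requested from the R console with the render() function from the rmarkdown package. The example below writes a tiny throwaway .Rmd file to a temporary location so that it is self-contained; for the lab file you would pass its own path instead (presumably "Week02_Rmd_intro.Rmd", which is an assumption based on the name of the output file mentioned earlier).

```r
library(rmarkdown)

# write a tiny throwaway .Rmd file so the example is self-contained
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  "title: \"Render demo\"",
  "---",
  "",
  "Just a line of text to put in the document."
), rmd_file)

# render() relies on pandoc, which comes bundled with R Studio
if (pandoc_available()) {
  # output_format overrides whatever the YAML header asks for;
  # use "odt_document" instead if you work with OpenOffice
  out_file <- render(rmd_file, output_format = "word_document", quiet = TRUE)
  out_file  # path to the generated .docx file
}
```

The generated file ends up next to the source file, here in R's temporary folder.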
Task 9: For your final task, get your notes from last week’s tutorial and turn them into a nice document written using R Markdown and render it as PDF, R Notebook, or Word (OpenOffice) document.
Well done!
That is all we have in store for you for this lab. We suggest you go over what you learnt today to help your newly acquired knowledge settle.
See you next week!