2.2 Markdown extensions by bookdown

Although Pandoc’s Markdown is much richer than the original Markdown syntax, it still lacks a number of things that we may need for academic writing. For example, it supports math equations, but you cannot number and reference equations in multi-page HTML or EPUB output. We have provided a few Markdown extensions in bookdown to fill the gaps.

2.2.1 Number and reference equations

To number and refer to equations, put them in equation environments and assign labels to them using the syntax (\#eq:label), e.g.,

\begin{equation} 
  f\left(k\right) = \binom{n}{k} p^k\left(1-p\right)^{n-k}
  (\#eq:binom)
\end{equation} 

It renders the equation below:

\[\begin{equation} f\left(k\right)=\binom{n}{k}p^k\left(1-p\right)^{n-k} \tag{2.1} \end{equation}\]

You may refer to it using \@ref(eq:binom), e.g., see Equation (2.1).

Equation labels must start with the prefix eq: in bookdown. All labels in bookdown must only contain alphanumeric characters, :, -, and/or /. Equation references work best for LaTeX/PDF output, and they are not well supported in Word output or e-books. For HTML output, bookdown can only number the equations with labels. Please make sure equations without labels are not numbered by either using the equation* environment or adding \nonumber or \notag to your equations. The same rules apply to other math environments, such as eqnarray, gather, align, and so on (e.g., you can use the align* environment).
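As a minimal sketch of these rules (the label eq:trig-id and the formulas are hypothetical, chosen only for illustration), the first environment below suppresses the number of an unlabeled equation with \nonumber, and the second uses gather with one unnumbered line and one labeled line, following the same pattern that applies to align:

```latex
\begin{equation}
  e^{i\pi} + 1 = 0 \nonumber
\end{equation}

\begin{gather}
  \tan x = \frac{\sin x}{\cos x} \notag \\
  \sin^2 x + \cos^2 x = 1 (\#eq:trig-id)
\end{gather}
```

The labeled line could then be cited with \@ref(eq:trig-id), while the other two formulas stay unnumbered in both HTML and PDF output.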

We demonstrate a few more math equation environments below. Here is an unnumbered equation using the equation* environment:

\begin{equation*} 
\frac{d}{dx}\left( \int_{a}^{x} f(u)\,du\right)=f(x)
\end{equation*} 

\[\begin{equation*} \frac{d}{dx}\left( \int_{a}^{x} f(u)\,du\right)=f(x) \end{equation*}\]

Below is an align environment (2.2):

\begin{align} 
g(X_{n}) &= g(\theta)+g'({\tilde{\theta}})(X_{n}-\theta) \notag \\
\sqrt{n}[g(X_{n})-g(\theta)] &= g'\left({\tilde{\theta}}\right)
  \sqrt{n}[X_{n}-\theta ] (\#eq:align)
\end{align} 

\[\begin{align} g(X_{n}) &= g(\theta)+g'({\tilde{\theta}})(X_{n}-\theta) \notag \\ \sqrt{n}[g(X_{n})-g(\theta)] &= g'\left({\tilde{\theta}}\right) \sqrt{n}[X_{n}-\theta ] \tag{2.2} \end{align}\]

You can use the split environment inside equation so that all lines share the same number (2.3). By default, each line in the align environment is assigned an equation number; in the previous example, we suppressed the number of the first line using \notag. In the example below, the whole split environment is assigned a single number.

\begin{equation} 
\begin{split}
\mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y)\\
 & =(X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')'\\
 & =(X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1}\\
 & =(X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1}\\
 & =(X'X)^{-1}\sigma^{2}
\end{split}
(\#eq:var-beta)
\end{equation} 

\[\begin{equation} \begin{split} \mathrm{Var}(\hat{\beta}) & =\mathrm{Var}((X'X)^{-1}X'y)\\ & =(X'X)^{-1}X'\mathrm{Var}(y)((X'X)^{-1}X')'\\ & =(X'X)^{-1}X'\mathrm{Var}(y)X(X'X)^{-1}\\ & =(X'X)^{-1}X'\sigma^{2}IX(X'X)^{-1}\\ & =(X'X)^{-1}\sigma^{2} \end{split} \tag{2.3} \end{equation}\]

2.2.2 Theorems and proofs

Theorems and proofs are commonly used in articles and books in mathematics. However, please do not be misled by the names: a “theorem” is just a numbered/labeled environment, and it does not have to be a mathematical theorem (e.g., it can be an example irrelevant to mathematics). Similarly, a “proof” is an unnumbered environment. In this section, we always use the general meanings of a “theorem” and “proof” unless explicitly stated.

In bookdown, the types of theorem environments supported are in Table 2.1. To write a theorem, you can use the syntax below:

::: {.theorem}
This is a `theorem` environment that can contain **any**
_Markdown_ syntax.
:::

This syntax is based on Pandoc’s fenced Div blocks and can already be used in any R Markdown document to write custom blocks. Bookdown only offers special handling for theorem and proof environments. Since this uses the syntax of Pandoc’s Markdown, you can write any valid Markdown text inside the block.

TABLE 2.1: Theorem environments in bookdown.

| Environment | Printed Name | Label Prefix |
|-------------|--------------|--------------|
| theorem     | Theorem      | thm          |
| lemma       | Lemma        | lem          |
| corollary   | Corollary    | cor          |
| proposition | Proposition  | prp          |
| conjecture  | Conjecture   | cnj          |
| definition  | Definition   | def          |
| example     | Example      | exm          |
| exercise    | Exercise     | exr          |
| hypothesis  | Hypothesis   | hyp          |

To write other theorem environments, replace ::: {.theorem} with other environment names in Table 2.1, e.g., ::: {.lemma}.

A theorem can have a name attribute so its name will be printed. For example,

::: {.theorem name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have
$$a^2 + b^2 = c^2$$
:::

If you want to refer to a theorem, you should label it. The label can be provided as an ID to the block of the form #label. For example,

::: {.theorem #foo}
A labeled theorem here.
:::

After you label a theorem, you can refer to it using the syntax \@ref(prefix:label). See the column Label Prefix in Table 2.1 for the value of prefix for each environment. For example, we have a labeled and named theorem below, and \@ref(thm:pyth) gives us its theorem number 2.1:

::: {.theorem #pyth name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have

$$a^2 + b^2 = c^2$$
:::

Theorem 2.1 (Pythagorean theorem) For a right triangle, if \(c\) denotes the length of the hypotenuse and \(a\) and \(b\) denote the lengths of the other two sides, we have

\[a^2 + b^2 = c^2\]
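The prefix changes with the environment. As a minimal sketch (the label chf, the name, and the lemma text are hypothetical), a labeled lemma is referenced with the lem prefix from Table 2.1 rather than thm:

```markdown
::: {.lemma #chf name="Uniqueness"}
Two random variables with the same characteristic function
have the same probability distribution.
:::

By Lemma \@ref(lem:chf), it suffices to compare characteristic functions.
```

Note that \@ref(lem:chf) only produces the number, so the word "Lemma" has to be written out in the text.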

The proof environments currently supported are proof, remark, and solution. The syntax is similar to theorem environments, and proof environments can also be named using the name attribute. The only difference is that since they are unnumbered, you cannot reference them, even if you provide an ID to a proof environment.
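A minimal sketch of the three proof environments, written with the same fenced Div syntax as theorems (the names and the text inside each block are hypothetical):

```markdown
::: {.proof name="Sketch of the argument"}
Apply the law of cosines with a right angle, so that the cosine term vanishes.
:::

::: {.remark}
The converse of the theorem also holds.
:::

::: {.solution}
Substitute $a = 3$ and $b = 4$ to get $c = 5$.
:::
```

An ID such as #my-proof could be added to any of these blocks, but it would not make the block referenceable, because proof environments are unnumbered.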

We have tried to make all these theorem and proof environments work out of the box, no matter if your output is PDF or HTML. If you are a LaTeX or HTML expert, you may want to customize the style of these environments anyway (see Chapter 4). Customization in HTML is easy with CSS, and each environment is enclosed in <div></div> with the CSS class being the environment name, e.g., <div class="lemma"></div>. For LaTeX output, we have predefined the style to be definition for environments definition, example, exercise, and hypothesis, and remark for environments proof and remark. All other environments use the plain style. The style definition is done through the \theoremstyle{} command of the amsthm package. If you do not want the default theorem definitions to be automatically added by bookdown, you can set options(bookdown.theorem.preamble = FALSE). This can be useful, for example, to avoid conflicts in single documents (Section 3.4) using the output format bookdown::pdf_book with a base_format that has already included amsmath definitions.

Theorems are numbered by chapters by default. If there are no chapters in your document, they are numbered by sections instead. If the whole document is unnumbered (the output format option number_sections = FALSE), all theorems are numbered sequentially as 1, 2, …, N. LaTeX supports numbering one theorem environment after another, e.g., let theorems and lemmas share the same counter. This is not supported for HTML/EPUB output in bookdown. You can change the numbering scheme in the LaTeX preamble by defining your own theorem environments, e.g.,

\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}

When bookdown detects \newtheorem{theorem} in your LaTeX preamble, it will not write out its default theorem definitions, which means you have to define all theorem environments by yourself. For the sake of simplicity and consistency, we do not recommend that you do this. It can be confusing when your Theorem 18 in PDF becomes Theorem 2.4 in HTML.

Below we show more examples[1] of the theorem and proof environments, so you can see the default styles in bookdown.

Definition 2.1 The characteristic function of a random variable \(X\) is defined by

\[\varphi _{X}(t)=\operatorname {E} \left[e^{itX}\right], \; t\in\mathcal{R}\]

Example 2.1 We derive the characteristic function of \(X\sim U(0,1)\) with the probability density function \(f(x)=\mathbf{1}_{x \in [0,1]}\).

\[\begin{equation*} \begin{split} \varphi _{X}(t) &= \operatorname {E} \left[e^{itX}\right]\\ & =\int e^{itx}f(x)dx\\ & =\int_{0}^{1}e^{itx}dx\\ & =\int_{0}^{1}\left(\cos(tx)+i\sin(tx)\right)dx\\ & =\left.\left(\frac{\sin(tx)}{t}-i\frac{\cos(tx)}{t}\right)\right|_{0}^{1}\\ & =\frac{\sin(t)}{t}-i\left(\frac{\cos(t)-1}{t}\right)\\ & =\frac{i\sin(t)}{it}+\frac{\cos(t)-1}{it}\\ & =\frac{e^{it}-1}{it} \end{split} \end{equation*}\]

Note that we used the fact \(e^{ix}=\cos(x)+i\sin(x)\) twice.

Lemma 2.1 Two random variables \(X_1\) and \(X_2\) have the same probability distribution if and only if

\[\varphi _{X_1}(t)=\varphi _{X_2}(t)\]

Theorem 2.2 If \(X_1\), …, \(X_n\) are independent random variables, and \(a_1\), …, \(a_n\) are some constants, then the characteristic function of the linear combination \(S_n=\sum_{i=1}^na_iX_i\) is

\[\varphi _{S_{n}}(t)=\prod_{i=1}^n\varphi _{X_i}(a_{i}t)=\varphi _{X_{1}}(a_{1}t)\cdots \varphi _{X_{n}}(a_{n}t)\]

Proposition 2.1 The distribution of the sum of independent Poisson random variables \(X_i \sim \mathrm{Pois}(\lambda_i),\: i=1,2,\cdots,n\) is \(\mathrm{Pois}(\sum_{i=1}^n\lambda_i)\).

Proof. The characteristic function of \(X\sim\mathrm{Pois}(\lambda)\) is \(\varphi _{X}(t)=e^{\lambda (e^{it}-1)}\). Let \(P_n=\sum_{i=1}^nX_i\). We know from Theorem 2.2 that

\[\begin{equation*} \begin{split} \varphi _{P_{n}}(t) & =\prod_{i=1}^n\varphi _{X_i}(t) \\ & =\prod_{i=1}^n e^{\lambda_i (e^{it}-1)} \\ & = e^{\sum_{i=1}^n \lambda_i (e^{it}-1)} \end{split} \end{equation*}\]

This is the characteristic function of a Poisson random variable with the parameter \(\lambda=\sum_{i=1}^n \lambda_i\). From Lemma 2.1, we know the distribution of \(P_n\) is \(\mathrm{Pois}(\sum_{i=1}^n\lambda_i)\).

Remark. In some cases, it is very convenient and easy to figure out the distribution of the sum of independent random variables using characteristic functions.

Corollary 2.1 The characteristic function of the sum of two independent random variables \(X_1\) and \(X_2\) is the product of characteristic functions of \(X_1\) and \(X_2\), i.e.,

\[\varphi _{X_1+X_2}(t)=\varphi _{X_1}(t) \varphi _{X_2}(t)\]

Exercise 2.1 (Characteristic Function of the Sample Mean) Let \(\bar{X}=\sum_{i=1}^n \frac{1}{n} X_i\) be the sample mean of \(n\) independent and identically distributed random variables, each with characteristic function \(\varphi _{X}\). Compute the characteristic function of \(\bar{X}\).

Solution. Applying Theorem 2.2, we have

\[\varphi _{\bar{X}}(t)=\prod_{i=1}^n \varphi _{X_i}\left(\frac{t}{n}\right)=\left[\varphi _{X}\left(\frac{t}{n}\right)\right]^n.\]

Hypothesis 2.1 (Riemann hypothesis) The Riemann zeta function is defined as \[\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}\] for complex values of \(s\) whose real part is greater than 1, where the series converges. The Riemann hypothesis is that the Riemann zeta function has its zeros only at the negative even integers and at complex numbers with real part \(1/2\).

2.2.2.1 A note on the old syntax

For older versions of bookdown (before v0.21), a theorem environment could be written like this:

```{theorem pyth, name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have

$$a^2 + b^2 = c^2$$
```

This syntax still works, but we do not recommend it since the new syntax allows you to write richer content and has a cleaner implementation.

The conversion between the two syntaxes is straightforward. The above theorem could be rewritten in this way:

::: {.theorem #pyth name="Pythagorean theorem"}
For a right triangle, if $c$ denotes the length of the hypotenuse
and $a$ and $b$ denote the lengths of the other two sides, we have

$$a^2 + b^2 = c^2$$
:::

You can use the helper function bookdown::fence_theorems() to convert a whole file or a piece of text. This is a one-time operation. We have tried to do the conversion from old to new syntax safely, but we might have missed some edge cases. To make sure you do not overwrite the input file by accident, you can write the converted source to a new file, e.g.,

bookdown::fence_theorems("01-intro.Rmd", output = "01-intro-new.Rmd")

Then double-check the content of 01-intro-new.Rmd. Using output = NULL will print the result of the conversion in the R console, which is another way to check the conversion. If you are using a version control tool, you can set output to be the same as input, as it should be safe and easy for you to revert the change if anything goes wrong.

2.2.3 Special headers

There are a few special types of first-level headers that will be processed differently in bookdown. The first type is an unnumbered header that starts with the token (PART). Headers of this kind are translated to part titles. If you are familiar with LaTeX, this basically means \part{}. When your book has a large number of chapters, you may want to organize them into parts, e.g.,

# (PART) Part I {-} 

# Chapter One

# Chapter Two

# (PART) Part II {-} 

# Chapter Three

A part title should be written right before the first chapter title in this part, with both titles in the same document. You can use (PART\*) (the backslash before * is required) instead of (PART) if a part title should not be numbered.
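For illustration, an unnumbered part would be written as in the sketch below (the part and chapter titles are hypothetical):

```markdown
# (PART\*) Background {-}

# Introduction

# Literature Review
```

The only change from the numbered form is the token itself, so switching between the two styles does not require touching the chapter headers.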

The second type is an unnumbered header that starts with (APPENDIX), indicating that all chapters after this header are appendices, e.g.,

# Chapter One 

# Chapter Two

# (APPENDIX) Appendix {-} 

# Appendix A

# Appendix B

The numbering style of appendices will be automatically changed in LaTeX/PDF and HTML output (usually in the form A, A.1, A.2, B, B.1, …). This feature is not available for e-books or Word output.

2.2.4 Text references

You can assign some text to a label and reference the text using the label elsewhere in your document. This can be particularly useful for long figure/table captions (Sections 2.4 and 2.5), in which case you would normally have to write the whole character string in the chunk header (e.g., fig.cap = "A long long figure caption.") or your R code (e.g., kable(caption = "A long long table caption.")). It is also useful when these captions contain special HTML or LaTeX characters, e.g., if the figure caption contains an underscore, it works in the HTML output but may not work in LaTeX output because the underscore must be escaped in LaTeX.

The syntax for a text reference is (ref:label) text, where label is a label[2] for text that is unique throughout the document. The text reference must be in a separate paragraph with empty lines above and below it. The paragraph must not be wrapped into multiple lines, and should not end with a white space. For example,

(ref:foo) Define a text reference **here**.

Then you can use (ref:foo) in your figure/table captions. The text can contain anything that Markdown supports, as long as it is one single paragraph. Here is a complete example:

A normal paragraph.

(ref:foo) A scatterplot of the data `cars` using **base** R graphics.

```{r foo, fig.cap='(ref:foo)'}
plot(cars)  # a scatterplot
```

Text references can be used anywhere in the document (not limited to figure captions). They can also be useful if you want to reuse a fragment of text in multiple places.
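The same mechanism can be sketched for a table caption and for reuse in body text. The label cars-tab below is hypothetical; it doubles as the chunk label, and the example assumes the knitr package is available for kable():

````markdown
(ref:cars-tab) The first six rows of the `cars` data.

```{r cars-tab}
knitr::kable(head(cars), caption = '(ref:cars-tab)')
```

The caption text can be repeated in a normal paragraph: (ref:cars-tab)
````

Keeping the caption in one text reference means that a later change to the wording only has to be made in a single place.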


  1. Some examples are adapted from the Wikipedia page https://en.wikipedia.org/wiki/Characteristic_function_(probability_theory)

  2. You may consider using the code chunk labels.