This is work in progress: At the moment I am working on the practice section of chapter 4, i.e., I have finished \(\approx 20\%\) of the book content.
This book collects my personal notes taken while reading Statistical Rethinking by Richard McElreath. I am using the second edition, published in 2020 by CRC Press, an imprint of Routledge of the Taylor & Francis Group. Additionally, I am using Statistical Rethinking 2023, the most recent set of free YouTube video lectures.
You can find links to other material on McElreath’s website about the book. Of special interest to me are the brms+tidyverse and the Stan+tidyverse conversions of his code. As I am not very experienced with R and completely new to Bayesian statistics and its tools, this additional material is also very challenging for me. I am planning to read these versions simultaneously (section by section) and will dedicate parallel sections to their approaches. This has the advantage that the section numbers of the files conform to the section numbers of the second edition of the printed book.
WATCH OUT: This is my personal learning material and is therefore not an authoritative textbook!
I wrote this book as a text for others to read because that forces me to become explicit and to explain all my learning outcomes more carefully. Please keep in mind that this text is not written by an expert but by a learner. Although I replicated most of the content, it may contain many mistakes. All these misapprehensions and errors are my responsibility.
My text consists mostly of quotes from the second edition of McElreath’s book. I converted my Kindle book into a PDF file, from which I copied passages via the annotation system in Zotero into my Quarto files.
Example 1 : Quote
“Bayesian inference is really just counting and comparing of possibilities.” (McElreath, 2020, p. 20) (pdf)
Example 1 has links to my PDF and also to my annotation of the PDF. These links are a practical way for me to get the context of the quote. But as the linked PDF is saved locally on my hard disk, these links do not work for you! (Zotero groups offer an option to share files, but the PDF is not free to use and so I can’t offer this possibility.)
Often I made minor edits (e.g., shortening the text) or put the content in my own words. In these cases I couldn’t quote the text, as it does not represent a specific annotation in my Zotero file. Instead I ended the paraphrase with (McElreath ibid.).
In any case most of the text in this Quarto book is not mine but comes from different resources (McElreath’s book or video lectures, Kurz’ website, R help files, package vignettes, …). Most of the time I have put my own personal notes into a notes box as shown in Example 2.
Example 2 : Personal note
Note 1 : This is a personal note
In this kind of box I will write my personal thoughts and reflections. Usually this box will appear stand-alone (without the wrapping example box).
In any case I alone am responsible for this text, especially if I have used code from the resources wrongly or misunderstood a quoted text passage.
Sections with the
The packages {rethinking} and {brms} have similar tasks. Therefore they share many identical function names. To prevent name conflicts, Kurz unloaded the {rethinking} package whenever he explained {brms} functions. But this approach is not efficient for the structure of my documents, where I constantly switch between these two packages. So I have followed the advice “Qualifying namespaces” from Google’s R Style Guide.
Whenever I used a function I called it with the package name in front, using the syntax <package name>::<function name>(). Besides preventing conflicts between functions of identical names from different packages, this helps to learn (or remember) which function belongs to which package. I think this justifies the small overhead and helps to make R code chunks self-sufficient. (No previous package loading or library calls in the setup chunk are needed.) To foster learning the relation between function and package I enclose the package name in curly braces and format it in bold.
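The effect of qualifying the namespace can be tried out without installing anything. The following is a minimal sketch that uses only base R: the local function named sd is an assumption made up for illustration, standing in for any function that shares its name with a function from another package.

```r
# A local function that happens to share its name with stats::sd().
# (Made up for illustration; it stands in for any conflicting name.)
sd <- function(x) "my own sd"

x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# The unqualified call finds the local definition first,
# which masks the function from {stats}.
sd(x)

# The qualified call always reaches the function in {stats},
# no matter what else is defined or attached.
stats::sd(x)
```

The same pattern applies to two attached packages that export identical function names: the qualified call makes explicit which of them is meant.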
To prevent conflicts in chunk names, objects and variables I added one of the following suffixes to the end of the name:
a for the original book version
b for the {tidyverse} / {brms} version
To distinguish the models I used
Example 3 : Name of models
m4.3a refers to the third {rethinking} model in the fourth chapter.
m2.1b refers to the first {tidyverse}/{brms} model in the second chapter.
#| label: chap04-precis2-m4-1a is the chunk label in the fourth chapter using the second version of the precis summary for model m4.1a.
I am not using the exact code snippets for my replications, because I am replicating the code not only to see how it works but also to change the values of parameters and observe their influence.
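The naming convention for models and chunk labels described above can be sketched in a few lines of base R. Both helpers, parse_model_name() and chunk_label(), are hypothetical names invented for this illustration; they are not part of {rethinking}, {brms} or this book's code.

```r
# Hypothetical helper: split a model name such as "m4.3a" into
# chapter, model number and code variant (a = book, b = tidyverse/brms).
parse_model_name <- function(name) {
  pattern <- "^m([0-9]+)\\.([0-9]+)([ab])$"
  if (!grepl(pattern, name)) {
    stop("Not a model name of the form m<chapter>.<number><a|b>: ", name)
  }
  parts <- regmatches(name, regexec(pattern, name))[[1]]
  list(
    chapter = as.integer(parts[2]),
    model   = as.integer(parts[3]),
    variant = if (parts[4] == "a") "rethinking" else "tidyverse/brms"
  )
}

# Hypothetical helper: build a chunk label from chapter, topic and model.
# Dots are replaced by hyphens, as in the label shown in Example 3.
chunk_label <- function(chapter, topic, model) {
  sprintf("chap%02d-%s-%s", chapter, topic, gsub(".", "-", model, fixed = TRUE))
}

parse_model_name("m4.3a")
chunk_label(4, "precis2", "m4.1a")
```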
My focus is on learning Bayesian statistics. Therefore I have not replicated all code snippets from Kurz’ version in case they have no relation to Bayesian statistics but are just graphics explaining general procedures.
This is my first book using Quarto instead of bookdown. I am therefore also using these notes to learn Quarto. As a result you will sometimes find remarks or call-out blocks about my Quarto experiences.
Go to the book website and download the R code examples for the book.
Listing 1: Download R code for book examples (not evaluated here)
dir.create("R")
download.file("http://xcelab.net/rmpubs/sr2/code.txt", "R/code.R")
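When a document is rendered repeatedly, it can be convenient to skip the download if the file is already there. The following is a minimal sketch under that assumption; fetch_once() is a hypothetical helper written for this note, and its fetch argument exists only so that the function can be tried out without an internet connection.

```r
# Hypothetical helper: download a file only if it is not already present.
# `fetch` defaults to utils::download.file() and can be replaced for testing.
fetch_once <- function(url, dest, fetch = utils::download.file) {
  # Create the target folder if needed; no warning if it already exists.
  dir.create(dirname(dest), showWarnings = FALSE, recursive = TRUE)
  if (file.exists(dest)) {
    return(invisible(FALSE)) # nothing was downloaded
  }
  fetch(url, dest)
  invisible(TRUE) # a download was attempted
}

# Usage (not evaluated here, needs an internet connection):
# fetch_once("http://xcelab.net/rmpubs/sr2/code.txt", "R/code.R")
```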
The style of the code snippets is not the tidyverse style. For instance: the equal sign = is not surrounded by spaces, and in a list of variables separated by commas there is a space both before and after each comma.
I have converted the original code style to tidyverse style with the RStudio addin of the {styler} package. Assuming that the default style transformer is styler::tidyverse_style(), I selected the code snippet I wanted to convert and called the addin, which ran styler:::style_selection(). See Example 4.
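Outside of RStudio the same conversion can be tried on a character string. This is a minimal sketch, assuming the {styler} package is installed; it uses styler::style_text(), which applies the tidyverse style by default. The input line is only an illustration of the spacing described above.

```r
# One line of code in the spacing style of the original snippets
original <- "dbinom( 6 , size=9 , prob=0.5 )"

if (requireNamespace("styler", quietly = TRUE)) {
  # style_text() uses styler::tidyverse_style() unless told otherwise
  styled <- styler::style_text(original)
  print(styled)
} else {
  message("Package {styler} is not installed; nothing was restyled.")
}
```

In contrast to the addin, this call does not change any file; it only returns the restyled text.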
To facilitate the comparison of {rethinking} and {tidyverse}/{brms} code I have used tabs. This has the disadvantage that one cannot jump directly to links under the tabs. In this case I have linked to the wrapping example and indicated the specific tab where the R code can be found. With graphics it is easier, because if you hover over the links you see the original graphic in a smaller overlay. This is very convenient for comparing two different graphics (for instance the same graphic with {rethinking} versus {tidyverse} coding). Try it out and hover over Graph 2.
Example 4 : Comparison of code snippets in {rethinking} and {tidyverse} style
To give a better orientation inside RStudio I have segmented the R code snippets as in the example above (“## R code 2.7 ##################”). In RStudio one can detect these lines easily, as they are displayed as bold headers. This is very helpful for navigating inside the Quarto file.
As copy & paste from the slides does not work, I downloaded the PDF of the Speaker Deck slides. But even that did not always work. In that case I used TextSniper and formatted the code manually. But these copy & paste problems only arise when using new code prepared for the 3rd edition. With the book (2nd ed.) I have no problems copying the code snippets via calibre from the ePUB eBook version.
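The same header lines can also be used to pick out individual snippets from a code file. The following is a minimal sketch in base R; split_snippets() is a hypothetical helper, and the sample lines are invented for illustration and only mimic the header layout quoted above.

```r
# A few lines in the layout of the header lines described above
# (the code between the headers is invented for illustration)
code_lines <- c(
  "## R code 2.6 ##################",
  "x <- 1:3",
  "mean(x)",
  "",
  "## R code 2.7 ##################",
  "y <- x^2"
)

# Hypothetical helper: group lines by the preceding "## R code x.y" header
split_snippets <- function(lines) {
  is_header <- grepl("^## R code [0-9]+\\.[0-9]+", lines)
  ids <- sub("^## R code ([0-9]+\\.[0-9]+).*$", "\\1", lines[is_header])
  group <- cumsum(is_header) # running number of the current snippet
  keep <- !is_header & group > 0 & nzchar(trimws(lines))
  snippets <- split(lines[keep], factor(group[keep], levels = seq_along(ids)))
  names(snippets) <- ids
  snippets
}

snippets <- split_snippets(code_lines)
names(snippets)
snippets[["2.6"]]
```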
In contrast to the sparse and partly outdated remarks in the book, use the installation section of the rethinking package at GitHub.
Of the three steps I had already successfully completed the first one (rstan and the C++ toolchain), so I had no need to follow the detailed instructions for the rstan installation at https://mc-stan.org/users/interfaces/rstan.html.
To install the cmdstanr package I visited https://mc-stan.org/cmdstanr/. This is an addition to my previous installation with the older version (2nd ed., 2022). As I installed the latest beta version of cmdstanr for the first time, I also needed to compile the libraries with cmdstanr::install_cmdstan().
To check the result of my installation I ran cmdstanr::check_cmdstan_toolchain().
Listing 2: Install the cmdstanr package (not evaluated here)
install.packages("cmdstanr", repos = c("https://mc-stan.org/r-packages/", getOption("repos")))
cmdstanr::install_cmdstan()
cmdstanr::check_cmdstan_toolchain()
The command used to install cmdstanr did not install the vignettes, which take a long time to build, but they are always available online at https://mc-stan.org/cmdstanr/articles/.
The vignette Getting started with CmdStanR also recommends loading the bayesplot and posterior packages, which are used later in the CmdStanR examples. But I believe these two packages are not necessary if you just plan to stick with the book.
Once the infrastructure is installed one can install the packages used by the book. With the exception of rethinking, the companion package of the book, they can all be downloaded from CRAN.
I already had devtools installed, therefore I removed it from the list of packages to install.
Listing 3: Install packages (not evaluated here)
install.packages(c("coda", "mvtnorm", "loo", "dagitty", "shape"))
devtools::install_github("rmcelreath/rethinking")
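After the installation it may help to check which of the packages can actually be found. The following is a minimal sketch in base R; missing_packages() is a hypothetical helper, and the list of names simply repeats the packages mentioned in the installation steps above.

```r
# Packages named in the installation steps above
required <- c(
  "coda", "mvtnorm", "loo", "dagitty", "shape",
  "cmdstanr", "rethinking"
)

# Hypothetical helper: return the names of packages that cannot be found.
# system.file() returns "" for a package that is not installed and does
# not load the package, so the check has no side effects.
missing_packages <- function(pkgs) {
  installed <- vapply(
    pkgs,
    function(p) nzchar(system.file(package = p)),
    logical(1)
  )
  pkgs[!installed]
}

missing_packages(required)
```

An empty result means that all listed packages were found in the library paths; it does not verify that the C++ toolchain or CmdStan itself work.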
There are several websites with book solutions. They differ in quality and are not always exhaustive. For the purpose of comparison I have mostly consulted the following two collections of solutions:
I have also found two GitHub repos with solutions. The results of these solutions are not accessible online. One has to fork these repos and compile them to see the results.
WATCH OUT! Solutions are not authorized by the book author
These solutions are written by members of the #RStats community and are not authorized by Richard McElreath, the author of Statistical Rethinking.
Help appreciated!
If you find errors in this Quarto book or want to add a comment, please do not hesitate to open issues or PRs on my GitHub site. I really appreciate learning from more experienced R users! It shortens the learning paths of self-directed learners.
The following table matches the lectures (videos 2023 and slides 2023) with the book chapters of the second edition (2020). It was taken as a screenshot from Statistical Rethinking 2023 - 01 - The Golem of Prague (50:09), but can also be found as a slide in Statistical Rethinking 2023 - Lecture 01.
The following HTML table, taken from the README.md file for the 2023 lectures, provides a better overview with links to videos and slides.
Links to videos and slides
Week ## | Meeting date | Reading | Lectures |
---|---|---|---|
Week 01 | 06 January | Chapters 1, 2 and 3 | [1] <Golem of Prague> <Slides> [2] <Garden of Forking Data> <Slides> |
Week 02 | 13 January | Chapter 4 | [3] <Geocentric Models> <Slides> [4] <Categories and Curves> <Slides> |
Week 03 | 20 January | Chapters 5 and 6 | [5] <Elemental Confounds> <Slides> [6] <Good and Bad Controls> <Slides> |
Week 04 | 27 January | Chapters 7, 8 and 9 | [7] <Overfitting> <Slides> [8] <MCMC> <Slides> |
Week 05 | 03 February | Chapters 10 and 11 | [9] <Modeling Events> <Slides> [10] <Counts and Confounds> <Slides> |
Week 06 | 10 February | Chapters 11 and 12 | [11] <Ordered Categories> <Slides> [12] <Multilevel Models> <Slides> |
Week 07 | 17 February | Chapter 13 | [13] <Multilevel Adventures> <Slides> [14] <Correlated Features> <Slides> |
Week 08 | 24 February | Chapter 14 | [15] <Social Networks> <Slides> [16] <Gaussian Processes> <Slides> |
Week 09 | 03 March | Chapter 15 | [17] <Measurement> <Slides> [18] <Missing Data> <Slides> |
Week 10 | 10 March | Chapters 16 and 17 | [19] <Generalized Linear Madness> <Slides> [20] <Horoscopes> <Slides> |
Additional material
rethinking package: Statistical Rethinking course and book package: https://github.com/rmcelreath/rethinking. I am using version 2.31.