1 What’s in This Book (Read This First!)

1.1 Real people can read this book

Kruschke began his text with “This book explains how to actually do Bayesian data analysis, by real people (like you), for realistic data (like yours).” Agreed. Similarly, this project is designed to help those real people do Bayesian data analysis. While I’m at it, I may as well explicate my assumptions about you.

If you’re looking at this project, I’m guessing you’re a graduate student, a post-graduate academic or researcher of some sort. Which means I’m presuming you have at least a 101-level foundation in statistics. In his text, it seems like Kruschke presumed his readers would have a good foundation in calculus, too. I make no such presumption. But if your stats 101 chops are rusty, check out Legler and Roback’s free bookdown text, Broadening Your Statistical Horizons or Navarro’s free text, Learning statistics with R: A tutorial for psychology students and other beginners.

I presume a basic working fluency in R and a vague idea about what the tidyverse is. Kruschke does some R warm-up in Chapter 3, and I follow suit. But if you’re totally new to R, you might also consider starting with Peng’s R Programming for Data Science. The best introduction to the tidyvese-style of data analysis I’ve found is Grolemund and Wickham’s R for Data Science.

1.2 What’s in this book

This project is not meant to stand alone. It’s a supplement to the second edition of Kruschke’s Doing Bayesian Data Analysis. I follow the structure of his text, chapter by chapter, translating his analyses into brms and tidyverse code. However, many of the sections in the text are composed entirely of equations and prose, leaving us nothing to translate. When we run into those sections, the corresponding sections in this project might be blank or even missing.

Also beware the content herein will depart at times from the source material. Bayesian data analysis with HMC is an active area of development in terms of both statistical methods and software implementation. There will also be times when my thoughts and preferences on Bayesian data analysis diverge a bit from Kruschke’s. In those places of divergence, I will often provide references and explanations.

In this project, I use a handful of formatting conventions gleaned from R4DS and R Markdown: The Definitive Guide.

  • I put R and R packages (e.g., brms) in boldface.
  • R code blocks and their output appear in a gray background. E.g.,
## [1] 4
  • Did you notice how there were two strips of gray background, there? The first one designated the actual code. The second one was the output of that code. The output of a code block often begins with ##.
  • Functions are in a typewriter font and followed by parentheses, all atop a gray background (e.g., brm()).
  • When I want to make explicit what packages a given function comes from, I insert the double-colon operator :: between the package name and the function (e.g., tidyr::pivot_longer()).
  • R objects, such as data or function arguments, are in typewriter font atop a gray background (e.g., d or size = 2).
  • Hyperlinks are denoted by their typical blue-colored font.

1.3 What’s new in the second edition

This is my first attempt at this project. There’s nothing new from my end.

1.4 Gimme feedback (be polite)

I am not a statistician and I have no formal background in computer science. I just finished my PhD in clinical psychology in 2018. During my graduate training I developed an unexpected interest in applied statistics and, more recently, programming. I became an R user in 2015 and started learning about Bayesian statistics around 2013. There is still so much to learn, so my apologies for when my code appears dated or inelegant. There will also be occasions in which I’m not yet sure how to reproduce models or plots in the text. Which is all to say, suggestions on how to improve my code are welcome. If you’d like to learn more about me, you can find my website here.

1.5 Thank you!

While in grad school, I benefitted tremendously from free online content. This project and others like it (e.g., here or here or here) are my attempts to pay it forward. As soon as you’ve gained a little proficiency, do consider doing to same.

I addition to great texts like Kruschke’s, I’d like to point out a few other important resources that have allowed me to complete a project like this:

If you haven’t already, bookmark these resources and share them with your friends.

Session info

At the end of every chapter, I use the sessionInfo() function to help make my results more reproducible.

## R version 3.6.0 (2019-04-26)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS High Sierra 10.13.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] compiler_3.6.0  magrittr_1.5    tools_3.6.0     htmltools_0.4.0
##  [5] yaml_2.2.0      Rcpp_1.0.2      stringi_1.4.3   rmarkdown_1.13 
##  [9] knitr_1.23      stringr_1.4.0   xfun_0.10       digest_0.6.21  
## [13] rlang_0.4.1     evaluate_0.14