Rethinking Companion
1
The Golem of Prague
1.1
Statistical golems
1.2
Statistical Rethinking
1.2.1
What are we trying to do with the golems?
1.3
Tools for golem engineering
1.3.1
Bayesian data analysis
1.3.2
Model comparison and predictions
1.3.3
Multilevel models
1.3.4
Graphical causal models
1.4
Summary
Session Info
2
Small worlds and large worlds
2.1
The garden of forking data
2.1.1
Counting possibilities
2.1.2
Combining other information
2.1.3
From counts to probability
2.2
Building a model
2.2.1
A data story
2.2.2
Bayesian updating
2.2.3
Evaluate
2.3
Components of the model
2.3.1
Variables
2.3.2
Definitions
2.3.3
A model is born
2.4
Making the model go
2.4.1
Quadratic
2.4.2
MCMC
3
Sampling the imaginary
3.0.1
Probabilities vs. frequency counts
3.1
Sampling from a grid-approximate posterior
3.2
Sampling to summarize
3.2.1
Intervals of defined boundaries
3.2.2
Intervals of defined mass
3.2.3
Point estimates
3.3
Sampling to simulate prediction
3.3.1
Dummy data
3.3.2
Model checking
3.3.3
Practice with brms
4
Geocentric models
4.1
Why normal distributions are normal
4.1.1
Normal by addition
4.1.2
Normal by multiplication
4.1.3
Normal by log-multiplication
4.1.4
Using Gaussian distributions
4.2
A language for describing models
4.2.1
Re-describing the globe tossing model
4.3
Gaussian model of height
4.3.1
The data
4.3.2
The model
4.3.3
Grid approximation of the posterior distribution
4.3.4
Sampling from the posterior
4.3.5
Finding the posterior distribution with quap
4.3.6
Sampling with quap
4.4
Linear prediction
4.4.1
Finding the posterior distribution
4.4.2
Interpreting the posterior distribution
4.5
Curves from lines
4.5.1
Polynomial regression
4.5.2
Splines
5
The many variables & the spurious waffles
5.1
Spurious association
5.1.1
Think before you regress
5.1.2
Testable implications
5.1.3
Multiple regression notation
5.1.4
Approximating the posterior
5.1.5
Plotting multivariate posteriors
5.2
Masked relationship
5.3
Categorical variables
5.3.1
Binary categories
5.3.2
Many categories
6
The haunted DAG & the causal terror
6.1
Multicollinearity
6.1.1
Example: Predicting height from the length of a person’s legs
6.1.2
Multicollinear milk
6.2
Post-treatment bias
6.2.1
A prior is born
6.2.2
Blocked by consequence
6.2.3
Fungus and d-separation
6.3
Collider bias
6.3.1
Collider for false sorrow
6.3.2
The haunted DAG
6.4
Confronting confounding
6.4.1
Shutting the backdoor
6.4.2
Two roads
6.4.3
Backdoor waffles
7
Ulysses’ compass
7.1
The problem with parameters
7.1.1
More parameters (almost) always improve fit
7.1.2
Too few parameters hurts, too
7.2
Entropy and accuracy
7.2.1
Firing the weatherperson
7.2.2
Information and uncertainty
7.2.3
From entropy to accuracy
7.2.4
Estimating divergence
7.2.5
Scoring the right data
7.3
Golem taming: regularization
7.4
Predicting predictive accuracy
7.4.1
Cross-validation
7.4.2
Information criteria
7.4.3
Comparing CV, PSIS, and WAIC
7.5
Model comparison
7.5.1
Model mis-selection
7.5.2
Outliers and other illusions
8
Conditional manatees
8.1
Building an interaction
8.1.1
Making a rugged model
8.1.2
Adding an indicator isn’t enough
8.1.3
Adding an interaction does work
8.1.4
Plotting the interaction
8.2
Symmetry of interactions
8.3
Continuous interactions
8.3.1
A winter flower
8.3.2
The models
8.3.3
Plotting posterior predictions
8.3.4
Plotting prior predictions
9
Markov Chain Monte Carlo
9.1
Good King Markov and his island kingdom
9.2
Metropolis algorithms
9.2.1
Gibbs sampling
9.2.2
High-dimensional problems
9.3
Hamiltonian Monte Carlo
9.3.1
Another parable
9.3.2
Particles in space
9.3.3
Limitations
9.4
Easy HMC: ulam (brm)
9.4.1
Preparation
9.4.2
Sampling from the posterior
9.4.3
Sampling again, in parallel
9.4.4
Visualization
9.4.5
Checking the chain
9.5
Care and feeding of your Markov chain
9.5.1
How many samples do you need?
9.5.2
How many chains do you need?
9.5.3
Taming a wild chain
9.5.4
Non-identifiable parameters
10
Big entropy and the generalized linear model
10.1
Maximum entropy
10.1.1
Gaussian
10.1.2
Binomial
10.2
Generalized linear models
10.2.1
Meet the family
10.2.2
Linking linear models to distributions
10.2.3
Omitted variable bias (again)
10.2.4
Absolute and relative differences
10.2.5
GLMs and information criteria
11
God spiked the integers
11.1
Binomial regression
11.1.1
Logistic regression: Prosocial chimpanzees
11.1.2
Relative shark and absolute deer
11.1.3
Aggregated binomial: Chimps condensed
11.1.4
Aggregated admissions
12
Monsters and mixtures
13
Models with memory
14
Adventures in covariance
15
Missing data and other opportunities
16
Generalized linear madness
17
Horoscopes
References
Chapter 13
Models with memory