17  Introduction to Multi-Parameter Models

Example 17.1 Assume body temperatures (degrees Fahrenheit) of healthy adults follow a Normal distribution with unknown mean \(\mu\) and unknown standard deviation \(\sigma\). Suppose we wish to estimate both \(\mu\), the population mean healthy human body temperature, and \(\sigma\), the population standard deviation of body temperatures. Assume a discrete prior distribution according to which

  • \(\mu\) takes values 97.6, 98.1, 98.6 with prior probabilities 0.2, 0.3, 0.5, respectively.
  • \(\sigma\) takes values 0.5, 1 with prior probabilities 0.25, 0.75, respectively.
  • \(\mu\) and \(\sigma\) are independent.
  1. Start to construct the Bayes table. What are the possible values of the parameter (in this discrete example)? What are the prior probabilities? (Hint: the parameter \(\theta\) is a pair \((\mu, \sigma)\).)




  2. Suppose two temperatures of 97.5 and 97.9 are observed, independently. Identify the likelihood.




  3. Complete the Bayes table and find the posterior distribution after observing these two measurements. Compare to the prior distribution.




  4. The prior assumes that \(\mu\) and \(\sigma\) are independent. Are they independent according to the posterior distribution?




Example 17.2 Continuing Example 17.1, let’s assume a more reasonable, continuous prior for \((\mu,\sigma)\).

  1. What are the possible values of \(\sigma\)? How would you choose a prior for \(\sigma\)?




  2. For a prior, assume that: \(\mu\) has a Normal distribution with mean 98.6 and standard deviation 0.3, \(\sigma\) has a “Half-Normal(0, 1)” distribution (that’s Normal(0, 1) but constrained to positive values), and \(\mu\) and \(\sigma\) are independent. Simulate \((\mu, \sigma)\) pairs from the prior distribution and plot them.




  3. Simulate the prior predictive distribution. Do the values seem reasonable? If not, adjust the prior distribution of parameters and try again.




  4. Find and interpret a central 98% prior credible interval for \(\mu\).




  5. Find and interpret a central 98% prior credible interval for \(\sigma\).




  6. What is the prior credibility that both \(\mu\) and \(\sigma\) lie within their credible intervals?




Example 17.3 Continuing Example 17.1, we’ll now use grid approximation to approximate the posterior distribution.

  1. Assume a grid of \(\mu\) values from 96 to 100 in increments of 0.01, and a grid of \(\sigma\) values from 0.01 to 3 in increments of 0.01. How many possible values of the pair \((\mu, \sigma)\) are there; that is, how many rows are there in the Bayes table?




  2. Suppose first that we are given a sample of two measurements of 97.9 and 97.5. Use grid approximation to approximate the joint posterior distribution of \((\mu, \sigma)\). Simulate values from the joint posterior distribution and plot them. Compare to the prior. Compute the posterior correlation between \(\mu\) and \(\sigma\); are they independent according to the posterior distribution?




  3. In a recent study, the sample mean body temperature in a sample of 208 healthy adults was 97.7 degrees F and the sample standard deviation was 1.0 degree F; the sample data is summarized in Section 17.1.5 (it’s the same data we used in Section 14.1). Use grid approximation to approximate the joint posterior distribution of \((\mu, \sigma)\). Simulate values from the joint posterior distribution and plot them. Compare to the prior. Compute the posterior correlation between \(\mu\) and \(\sigma\); are they independent according to the posterior distribution?




Example 17.4 Continuing Example 17.1, we’ll now use simulation to approximate the posterior distribution, given the same sample data as before.

  1. Describe how, in principle, you would use a naive (non-MCMC) simulation to approximate the posterior distribution. In practice, what is the problem with such a simulation?




  2. Use brms to approximate the posterior distribution. Compare to the results of the grid approximation.




  3. Find and interpret a central 98% posterior credible interval for \(\mu\).




  4. Find and interpret a central 98% posterior credible interval for \(\sigma\).




  5. What is the posterior credibility that both \(\mu\) and \(\sigma\) lie within their credible intervals?




Example 17.5 Continuing the previous example.

  1. Describe how you could use simulation to approximate the posterior predictive distribution.




  2. Run the simulation, summarize the predictive distribution, and find and interpret a 95% posterior predictive interval.




  3. Explain how you could use posterior predictive checking to assess the reasonableness of the model in light of the observed data. Does the model seem reasonable?




17.1 Notes

17.1.1 Discrete grid

mu = c(97.6, 98.1, 98.6)
sigma = c(0.5, 1)

# theta is all possible (mu, sigma) pairs
theta = expand.grid(mu, sigma)
names(theta) = c("mu", "sigma")

# prior
prior_mu = c(0.20, 0.30, 0.50)
prior_sigma = c(0.25, 0.75)
prior = apply(expand.grid(prior_mu, prior_sigma), 1, prod)
prior = prior / sum(prior)

# data
y = c(97.9, 97.5) # two independent observations

# likelihood
likelihood = dnorm(97.9, mean = theta$mu, sd = theta$sigma) *
  dnorm(97.5, mean = theta$mu, sd = theta$sigma)

# posterior
product = likelihood * prior
posterior = product / sum(product)

# bayes table
bayes_table = data.frame(theta,
                     prior,
                     likelihood,
                     product,
                     posterior)

# display table
bayes_table |>
  adorn_totals("row", fill = "-", na.rm = TRUE, name = "Total", -c(mu, sigma)) |>
  kbl(digits = 4) |>
  kable_styling()
   mu  sigma  prior  likelihood  product  posterior
 97.6    0.5  0.050      0.5212   0.0261     0.2041
 98.1    0.5  0.075      0.2861   0.0215     0.1680
 98.6    0.5  0.125      0.0212   0.0027     0.0208
 97.6    1    0.150      0.1514   0.0227     0.1778
 98.1    1    0.225      0.1303   0.0293     0.2296
 98.6    1    0.375      0.0680   0.0255     0.1997
Total      -  1.000      1.1782   0.1277     1.0000
ggplot(bayes_table |>
         mutate(mu = factor(mu),
                sigma = factor(sigma)),
       aes(mu, sigma)) +
  geom_tile(aes(fill = prior))  +
  scale_fill_viridis(option = "mako",
                     limits = c(0, max(c(prior, posterior)))) +
  theme_bw()

ggplot(bayes_table |>
         mutate(mu = factor(mu),
                sigma = factor(sigma)),
       aes(mu, sigma)) +
  geom_tile(aes(fill = posterior)) +
  scale_fill_viridis(option = "mako",
                     limits = c(0, max(c(prior, posterior))))
(a) Prior distribution.
(b) Posterior distribution.
Figure 17.1: Joint distribution of mean and SD of body temperatures in Example 17.1.
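
Example 17.1 part 4 asks whether \(\mu\) and \(\sigma\) remain independent under the posterior. One direct check compares the joint posterior probabilities to the product of the marginal posteriors (a sketch; the posterior values are re-entered from the table above so the chunk runs on its own):

```r
# joint posterior probabilities from the Bayes table, arranged with
# mu in rows and sigma in columns
posterior = c(0.2041, 0.1680, 0.0208, 0.1778, 0.2296, 0.1997)
joint = matrix(posterior, nrow = 3,
               dimnames = list(mu = c(97.6, 98.1, 98.6),
                               sigma = c(0.5, 1)))

# marginal posterior distributions of mu and sigma
marg_mu = rowSums(joint)
marg_sigma = colSums(joint)

# under independence, joint = outer product of marginals;
# the nonzero differences show mu and sigma are now dependent
round(joint - outer(marg_mu, marg_sigma), 3)
```

The prior was constructed as a product of marginals, so \(\mu\) and \(\sigma\) started independent; observing data that is informative about both parameters jointly breaks that independence.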

17.1.2 Prior distribution

n_rep = 10000

mu_sim_prior = rnorm(n_rep, 98.6, 0.3)
sigma_sim_prior = abs(rnorm(n_rep, 0, 1))
sim_prior = data.frame(mu_sim_prior, sigma_sim_prior)

ggplot(sim_prior,
       aes(x = mu_sim_prior,
           y = sigma_sim_prior)) +
  geom_point(color = bayes_col["prior"], alpha = 0.4) +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()

ggplot(sim_prior,
       aes(x = mu_sim_prior,
           y = sigma_sim_prior)) +
  stat_density_2d(aes(fill = after_stat(level)),
                  geom = "polygon", color = "white") +
  scale_fill_viridis(option = "mako") +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()
(a) Scatterplot.
(b) Density plot.
Figure 17.2: Simulated joint prior distribution of mean and SD of body temperatures in Example 17.2.
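
The same simulated draws answer Example 17.2 parts 4–6. Since \(\mu\) and \(\sigma\) are independent in the prior, the credibility that both lie in their central 98% intervals should be about \(0.98^2 \approx 0.96\) (a sketch; the prior draws are re-simulated so the chunk is self-contained):

```r
n_rep = 10000
mu_sim_prior = rnorm(n_rep, 98.6, 0.3)
sigma_sim_prior = abs(rnorm(n_rep, 0, 1))

# central 98% prior credible intervals (1st and 99th percentiles)
ci_mu = quantile(mu_sim_prior, c(0.01, 0.99))
ci_sigma = quantile(sigma_sim_prior, c(0.01, 0.99))

# prior credibility that both parameters lie in their intervals;
# prior independence implies roughly 0.98 * 0.98 = 0.9604
mean(mu_sim_prior >= ci_mu[1] & mu_sim_prior <= ci_mu[2] &
       sigma_sim_prior >= ci_sigma[1] & sigma_sim_prior <= ci_sigma[2])
```

For \(\mu\) the interval is also available exactly as `qnorm(c(0.01, 0.99), 98.6, 0.3)`; the simulation-based quantiles should agree closely.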

17.1.3 Prior predictive distribution

# given theta = (mu, sigma), simulate y from Normal(mu, sigma)
y_sim_prior = rnorm(n_rep, mu_sim_prior, sigma_sim_prior)

# plot the simulated y values
ggplot(data.frame(y_sim_prior),
       aes(x = y_sim_prior)) + 
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["prior_predict"],
                 fill = "white") +
  geom_density(col = bayes_col["prior_predict"],
               linewidth = 1) +
  labs(x = "Body temperature (y)") +
  theme_bw()

# plot the cdf
ggplot(data.frame(y_sim_prior),
       aes(x = y_sim_prior)) + 
  stat_ecdf(col = bayes_col["prior_predict"],
            linewidth = 1) +
  labs(x = "Body temperature (y)",
       y = "cdf") +
  theme_bw()

quantile(y_sim_prior, c(0.025, 0.975))
     2.5%     97.5% 
 96.35845 100.74519 

17.1.4 Grid approximation: \(n=2\)
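
Example 17.3 part 1: the grid has 401 values of \(\mu\) and 300 values of \(\sigma\), so the Bayes table has \(401 \times 300 = 120{,}300\) rows. A quick check:

```r
# number of rows in the Bayes table = number of (mu, sigma) grid pairs
length(seq(96, 100, 0.01)) * length(seq(0.01, 3, 0.01))  # 120300
```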

mu = seq(96, 100, 0.01)
sigma = seq(0.01, 3, 0.01)

# theta is all possible (mu, sigma) pairs
theta = expand.grid(mu, sigma)
names(theta) = c("mu", "sigma")

# prior
prior_mu = dnorm(mu, 98.6, 0.3)
prior_sigma = dnorm(sigma, 0, 1)
prior = apply(expand.grid(prior_mu, prior_sigma), 1, prod)
prior = prior / sum(prior)

# data
y = c(97.9, 97.5) # two independent observations

# likelihood
likelihood = dnorm(97.9, mean = theta$mu, sd = theta$sigma) *
  dnorm(97.5, mean = theta$mu, sd = theta$sigma)

# posterior
product = likelihood * prior
posterior = product / sum(product)
n_rep = 10000

# sample (mu, sigma) pairs for the posterior distribution by
# sampling rows of the Bayes table according to posterior
# and then finding the (mu, sigma) values for the sampled rows

index_sim = sample(1:length(posterior),
                   size = n_rep, replace = TRUE,
                   prob = posterior)

sim_posterior = theta |> slice(index_sim)

ggplot(sim_posterior,
       aes(x = mu,
           y = sigma)) +
  geom_point(color = bayes_col["posterior"], alpha = 0.4) +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()

ggplot(sim_posterior,
       aes(x = mu,
           y = sigma)) +
  stat_density_2d(aes(fill = after_stat(level)),
                  geom = "polygon", color = "white") +
  scale_fill_viridis(option = "mako") +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()
(a) Scatterplot.
(b) Density plot.
Figure 17.3: Grid-approximated joint posterior distribution of mean and SD of body temperatures in Example 17.3.
cor(sim_posterior$mu, sim_posterior$sigma)
[1] 0.4622347

17.1.5 Sample data

data = read_csv("bodytemps.csv")
Rows: 208 Columns: 1
head(data)
# A tibble: 6 × 1
  bodytemp
     <dbl>
1     95.6
2     97.3
3     98.0
4     97.0
5     96.0
6     98.6
mean(data$bodytemp)
[1] 97.6999
sd(data$bodytemp)
[1] 1.000204
data |>
  ggplot(aes(x = bodytemp)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = "black", fill = "white") +
  geom_density(linewidth = 1) +
  theme_bw()

17.1.6 Grid approximation of posterior given sample data

mu = seq(96, 100, 0.01)
sigma = seq(0.01, 3, 0.01)

# theta is all possible (mu, sigma) pairs
theta = expand.grid(mu, sigma)
names(theta) = c("mu", "sigma")

# prior
prior_mu = dnorm(mu, 98.6, 0.3)
prior_sigma = dnorm(sigma, 0, 1)
prior = apply(expand.grid(prior_mu, prior_sigma), 1, prod)
prior = prior / sum(prior)

# data
y = data$bodytemp

# likelihood: loop over each (mu, sigma) pair (a vectorized version is faster)
likelihood = rep(NA, nrow(theta))

for (i in 1:nrow(theta)) {
  likelihood[i] = prod(dnorm(y, theta[i, "mu"], theta[i, "sigma"]))
}
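
The loop above multiplies 208 small densities, which risks numerical underflow; computing on the log scale and vectorizing is both safer and faster. A sketch (shown with a tiny stand-in grid and data so the chunk runs on its own; with the real grid, reuse `theta` and `y` from above):

```r
# vectorized, log-scale alternative to the likelihood loop
y = c(97.9, 97.5)  # stand-in data; use data$bodytemp with the real grid
theta = expand.grid(mu = c(97.6, 98.1), sigma = c(0.5, 1))

# n x n_grid matrix of log-densities, one column per (mu, sigma) pair,
# summed over observations to get the log-likelihood of each pair
log_lik = colSums(matrix(
  dnorm(rep(y, times = nrow(theta)),
        mean = rep(theta$mu, each = length(y)),
        sd = rep(theta$sigma, each = length(y)),
        log = TRUE),
  nrow = length(y)))

# subtracting the max before exponentiating avoids underflow;
# the constant cancels when the posterior is normalized
likelihood = exp(log_lik - max(log_lik))
```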

# posterior
product = likelihood * prior
posterior = product / sum(product)
n_rep = 10000

# sample (mu, sigma) pairs for the posterior distribution by
# sampling rows of the Bayes table according to posterior
# and then finding the (mu, sigma) values for the sampled rows

index_sim = sample(1:length(posterior),
                   size = n_rep, replace = TRUE,
                   prob = posterior)

sim_posterior = theta |> slice(index_sim)

ggplot(sim_posterior,
       aes(x = mu,
           y = sigma)) +
  geom_point(color = bayes_col["posterior"], alpha = 0.4) +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()

ggplot(sim_posterior,
       aes(x = mu,
           y = sigma)) +
  stat_density_2d(aes(fill = after_stat(level)),
                  geom = "polygon", color = "white") +
  scale_fill_viridis(option = "mako") +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()
(a) Scatterplot.
(b) Density plot.
Figure 17.4: Grid-approximated joint posterior distribution of mean and SD of body temperatures in Example 17.3.
cor(sim_posterior$mu, sim_posterior$sigma)
[1] 0.0691847
mean(sim_posterior$mu)
[1] 97.74661
sd(sim_posterior$mu)
[1] 0.06813899
mean(sim_posterior$sigma)
[1] 1.004185
sd(sim_posterior$sigma)
[1] 0.0495512
sim_posterior |>
  ggplot(aes(x = mu)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["posterior"], fill = "white") +
  geom_density(linewidth = 1, col = bayes_col["posterior"]) +
  theme_bw()
sim_posterior |>
  ggplot(aes(x = sigma)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["posterior"], fill = "white") +
  geom_density(linewidth = 1, col = bayes_col["posterior"]) +
  theme_bw()
(a) Population mean (mu).
(b) Population SD (sigma).
Figure 17.5: Grid-approximated marginal posterior distributions of mean and SD of body temperatures in Example 17.3.

17.1.7 Posterior simulation with brms

library(brms)
library(bayesplot)
library(tidybayes)

Next we fit the model to the data using the function brm. The three main inputs are:

  • Input 1: the data
    data = data

  • Input 2: model for the data (likelihood)
    family = gaussian
    bodytemp ~ 1 # linear model with intercept only

  • Input 3: prior distribution for parameters
    prior = c(prior(normal(98.6, 0.3), class = Intercept), prior(normal(0, 1), class = sigma))

  • Other arguments specify the details of the numerical algorithm; our specifications below will result in 10000 \(\theta\) values simulated from the posterior distribution. (Each of 4 chains (chains) simulates 3500 values (iter) and discards the first 1000 as warmup (warmup), leaving 2500 draws per chain and 10000 in total.)

fit <- brm(data = data,
           family = gaussian,
           bodytemp ~ 1,
           prior = c(prior(normal(98.6, 0.3), class = Intercept),
                     prior(normal(0, 1), class = sigma)),
           iter = 3500,
           warmup = 1000,
           chains = 4,
           refresh = 0)
Compiling Stan program...
Start sampling
summary(fit)
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: bodytemp ~ 1 
   Data: data (Number of observations: 208) 
  Draws: 4 chains, each with iter = 3500; warmup = 1000; thin = 1;
         total post-warmup draws = 10000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    97.75      0.07    97.61    97.87 1.00     8101     6717

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     1.00      0.05     0.91     1.11 1.00     8166     6352

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
plot(fit)

posterior = fit |>
  spread_draws(b_Intercept, sigma) |>
  rename(mu = b_Intercept)
posterior |> head(10) |> kbl() |> kable_styling()
.chain .iteration .draw mu sigma
1 1 1 97.81859 1.1173924
1 2 2 97.78917 1.0936603
1 3 3 97.70184 0.9921338
1 4 4 97.76928 1.0270858
1 5 5 97.70994 1.0905127
1 6 6 97.78400 0.9790875
1 7 7 97.72138 0.9984059
1 8 8 97.69896 0.9934875
1 9 9 97.75091 1.0016151
1 10 10 97.73454 1.0291681

Marginal posterior distribution of \(\mu\)

posterior |>
  ggplot(aes(x = mu)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["posterior"], fill = "white") +
  geom_density(linewidth = 1, col = bayes_col["posterior"]) +
  theme_bw()

# posterior mean of mu
mean(posterior$mu)
[1] 97.74613
# posterior sd of mu
sd(posterior$mu)
[1] 0.06570448
# posterior quantiles are endpoints of credible intervals for mu
quantile(posterior$mu, c(0.01, 0.10, 0.25, 0.50, 0.75, 0.90, 0.99))
      1%      10%      25%      50%      75%      90%      99% 
97.58944 97.66227 97.70334 97.74642 97.78941 97.82871 97.90012 

Marginal posterior distribution of \(\sigma\)

posterior |>
  ggplot(aes(x = sigma)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["posterior"], fill = "white") +
  geom_density(linewidth = 1, col = bayes_col["posterior"]) +
  theme_bw()

# posterior mean of sigma
mean(posterior$sigma)
[1] 1.00433
# posterior sd of sigma
sd(posterior$sigma)
[1] 0.04949865
# posterior quantiles are endpoints of credible intervals for sigma
quantile(posterior$sigma, c(0.01, 0.10, 0.25, 0.50, 0.75, 0.90, 0.99))
       1%       10%       25%       50%       75%       90%       99% 
0.8956390 0.9427211 0.9703619 1.0024817 1.0365893 1.0691254 1.1259351 
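
Example 17.4 part 5 asks for the joint posterior credibility that both parameters lie in their central 98% intervals. With the fitted draws, this is a single `mean()` over the `posterior` data frame; the sketch below uses stand-in draws matching the posterior means and SDs reported above (\(\mu \approx\) Normal(97.75, 0.066), \(\sigma \approx\) Normal(1.00, 0.050), nearly uncorrelated) so the chunk runs on its own:

```r
# stand-in posterior draws; with the real fit, replace these with
# posterior$mu and posterior$sigma
mu_draws = rnorm(10000, 97.75, 0.066)
sigma_draws = rnorm(10000, 1.00, 0.050)

# central 98% posterior credible intervals
ci_mu = quantile(mu_draws, c(0.01, 0.99))
ci_sigma = quantile(sigma_draws, c(0.01, 0.99))

# joint credibility; close to 0.98^2 = 0.9604 because mu and sigma are
# approximately independent in the posterior
mean(mu_draws >= ci_mu[1] & mu_draws <= ci_mu[2] &
       sigma_draws >= ci_sigma[1] & sigma_draws <= ci_sigma[2])
```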

Joint posterior distribution

ggplot(posterior,
       aes(x = mu,
           y = sigma)) +
  geom_point(color = bayes_col["posterior"], alpha = 0.4) +
  stat_ellipse(level = 0.98, color = "black", linewidth = 2) +
  stat_density_2d(color = "grey", linewidth = 1) +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()

ggplot(posterior,
       aes(x = mu,
           y = sigma)) +
  stat_density_2d(aes(fill = after_stat(level)),
                  geom = "polygon", color = "white") +
  scale_fill_viridis(option = "mako") +
  labs(x = "mean (mu)",
       y = "sd (sigma)") +
  theme_bw()
(a) Scatterplot.
(b) Density plot.
Figure 17.6: Simulated joint posterior distribution of mean and SD of body temperatures in Example 17.4.
cor(posterior$mu, posterior$sigma)
[1] 0.0756806

17.1.8 Posterior prediction

n_rep = nrow(posterior)

posterior <- posterior |>
  mutate(y_sim = rnorm(n_rep,
                       mean = posterior$mu,
                       sd = posterior$sigma))

posterior |> head(10) |> kbl() |> kable_styling()
.chain .iteration .draw mu sigma y_sim
1 1 1 97.81859 1.1173924 96.81975
1 2 2 97.78917 1.0936603 98.84586
1 3 3 97.70184 0.9921338 96.54721
1 4 4 97.76928 1.0270858 97.47131
1 5 5 97.70994 1.0905127 98.47622
1 6 6 97.78400 0.9790875 98.35398
1 7 7 97.72138 0.9984059 97.04329
1 8 8 97.69896 0.9934875 99.37878
1 9 9 97.75091 1.0016151 96.78816
1 10 10 97.73454 1.0291681 96.78550
ggplot(posterior,
       aes(x = y_sim)) +
  geom_histogram(aes(y = after_stat(density)),
                 col = bayes_col["posterior_predict"], fill = "white", bins = 100) +
  geom_density(linewidth = 1, col = bayes_col["posterior_predict"]) +
  labs(x = "Predicted body temp") +
  theme_bw()

# plot the cdf
ggplot(posterior,
       aes(x = y_sim)) + 
  stat_ecdf(col = bayes_col["posterior_predict"],
            linewidth = 1) +
  labs(x = "Body temperature (y)",
       y = "cdf") +
  theme_bw()

quantile(posterior$y_sim, c(0.025, 0.975))
    2.5%    97.5% 
95.73155 99.72476 

17.1.9 Posterior predictive checking

pp_check(fit, type = "hist")
Using 10 posterior draws for ppc type 'hist' by default.

pp_check(fit, ndraw = 100)
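
A posterior predictive check can also be done by hand: repeatedly simulate a data set of \(n = 208\) temperatures from Normal(\(\mu\), \(\sigma\)) using posterior draws, and compare simulated summary statistics to the observed ones. A sketch using the sample SD (with stand-in posterior draws so the chunk runs on its own; with the real fit, use `posterior$mu` and `posterior$sigma`):

```r
# stand-in posterior draws matching the fitted summaries above
mu_draws = rnorm(1000, 97.75, 0.066)
sigma_draws = rnorm(1000, 1.00, 0.050)

# for each posterior draw, simulate a data set of n = 208 temperatures
# and record its sample SD
sd_rep = sapply(1:1000,
                function(r) sd(rnorm(208, mu_draws[r], sigma_draws[r])))

# the observed sample SD (about 1.0) should fall comfortably inside the
# spread of simulated SDs if the model is reasonable
quantile(sd_rep, c(0.025, 0.975))
```

The same logic applies to any statistic of interest (sample mean, minimum, maximum); the model is called into question when the observed statistic sits far in the tails of its posterior predictive distribution.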