Introduction to Markov Chain Monte Carlo (MCMC) Methods
Monte Carlo Integration
The true value of \(\int_{-1}^1 e^{-x^2/2}dx\) is 1.7112488
1000 simulated reps
N =1000# simulate values uniformly in the boxu1 =runif(N, -1, 1)u2 =runif(N, 0, 1)# find the proportion of values that lie under the curve# and multiple by the area of the boxI =2*sum( u2 <exp(-u1 ^2/2)) / NI
N =100000# simulate values uniformly in the boxu1 =runif(N, -1, 1)u2 =runif(N, 0, 1)# find the proportion of values that lie under the curve# and multiple by the area of the boxI =2*sum( u2 <exp(-u1 ^2/2)) / NI
n =100n_rep =100000# Simulate values of X from the marginal distributionx_sim =rnorm(n_rep, 0.75, 0.05)# For each simulated value of x, simulate a value of y from Binomial(n, x)y_sim =rbinom(n_rep, n, x_sim)sim =data.frame(x_sim, y_sim)# Display the (x, y) results of a few repetitionssim |>head(10) |>kbl() |>kable_styling()
# Observed datay_obs =60# Only keep simulated (x, y) pairs with y = 60sim |>filter(y_sim == y_obs) |>head(10) |>kbl() |>kable_styling()
x_sim
y_sim
0.7013796
60
0.7385782
60
0.6319136
60
0.6542770
60
0.7027279
60
0.6477597
60
0.6601650
60
0.6739087
60
0.6698551
60
0.6421359
60
Approximate joint distribution with condition highlighted
# Plot the simulated (x, y) pairs from before, with y = 60 highlightedx_y_plot +annotate("rect", xmin =-Inf, xmax =Inf,ymin = y_obs -0.4, ymax = y_obs +0.4, alpha =0.5,color ="seagreen",fill ="seagreen")
Approximate conditional distribution
# Plot the simulated conditional distribution of X given y = 60sim |>filter(y_sim == y_obs) |>ggplot(aes(x = x_sim)) +geom_histogram(col ="seagreen", fill ="seagreen")
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
Started with 100,000 pairs, but conditional distribution is only based on roughly 500.
# number of stepsN =100000# number of flipsn =3# initializeXn =matrix(rep(NA, N * n), nrow = N)# start with all tailsXn[1, ] =rep(0, n)for (m in2:N){# current state; all but one flip will stay the same x = Xn[m-1, ]# index of which flip to propose to switch i =sample(1:n, 1) # if proposed switch results in good sequence, accept# otherwise, reject and stay in current state# annoying code for endpointsif (x[i] ==1){ x[i] =0 }else{ #if x[i]==0if((i ==1&& x[i+1] ==0) || (i == n && x[i-1] ==0) || (i>1&& i<n && x[i-1] ==0&& x[i+1] ==0)){ x[i] =1 } else { x[i] =0 } } Xn[m, ] = x}# find number of headsY =rowSums(Xn)plot(table(Y) / N, xlab ="number of H",ylab ="relative frequency",main =paste(N, "good sequences"))
mean(Y)
[1] 0.99665
100 flips - same code, just changing 3 to 100
# number of stepsN =100000# number of flipsn =100# initializeXn =matrix(rep(NA, N * n), nrow = N)# start with all tailsXn[1, ] =rep(0, n)for (m in2:N){# current state; all but one flip will stay the same x = Xn[m-1, ]# index of which flip to propose to switch i =sample(1:n, 1) # if proposed switch results in good sequence, accept# otherwise, reject and stay in current state# annoying code for endpointsif (x[i] ==1){ x[i] =0 }else{ #if x[i]==0if((i ==1&& x[i+1] ==0) || (i == n && x[i-1] ==0) || (i>1&& i<n && x[i-1] ==0&& x[i+1] ==0)){ x[i] =1 } else { x[i] =0 } } Xn[m, ] = x}# find number of headsY =rowSums(Xn)plot(table(Y) / N, xlab ="number of H",ylab ="relative frequency",main =paste(N, "good sequences"))