Chapter 2 Simulation Approach for Individual-Level Topic Weights
2.1 Model Description
2.1.1 Hierarchical Model Structure: Topic Weights ($\eta_{ik}(t)$)
We model the individual-level topic weights using Gaussian Processes (GPs) with genetic effects. The prior for the topic weights $\eta_{ik}(t)$ is given by:
$$\eta_{ik}(t) \sim \mathcal{GP}\left(\gamma_k g_i,\; K_{\eta,k}(t,t')\right)$$
Here:
- $\eta_{ik}(t)$ represents the topic weight for individual $i$ and topic $k$ at time $t$.
- $\gamma_k g_i$ represents the genetic effects for individual $i$ on topic $k$.
- $K_{\eta,k}(t,t')$ is the covariance function for the GP, which is shared across all individuals for a given topic $k$.
The observed noisy weights $\hat{w}_{ik}(t)$ are modeled as:
$$\hat{w}_{ik}(t) \mid \eta_{ik}(t) \sim \mathcal{N}\left(\eta_{ik}(t),\; \sigma^2_{\text{noise}} I\right)$$
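The generative model above can be simulated directly. The following is a minimal sketch, not the authors' implementation. It assumes a squared-exponential kernel for $K_{\eta,k}$ (the text does not specify the kernel), treats $g_i$ as a vector of genetic covariates and $\gamma_k$ as a matching vector of effects so that $\gamma_k g_i$ is an inner product, and uses illustrative sizes, length scales, and noise level throughout.

```python
import numpy as np

def rbf_kernel(t1, t2, amplitude=1.0, length_scale=5.0):
    """Squared-exponential covariance (an assumed choice of kernel)."""
    diff = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)

rng = np.random.default_rng(0)

# Illustrative sizes and hyperparameters (all hypothetical).
n_individuals, n_topics, n_covariates = 4, 3, 5
times = np.linspace(0.0, 50.0, 26)
length_scales = [4.0, 8.0, 12.0]   # one kernel per topic, shared across individuals
sigma_noise = 0.3

g = rng.normal(size=(n_individuals, n_covariates))            # g_i
gamma = rng.normal(scale=0.2, size=(n_topics, n_covariates))  # gamma_k
genetic_mean = g @ gamma.T                                    # entry [i, k] = gamma_k g_i

eta = np.empty((n_individuals, n_topics, times.size))
for k in range(n_topics):
    # Small jitter on the diagonal keeps the Cholesky factorization stable.
    K = rbf_kernel(times, times, length_scale=length_scales[k])
    chol = np.linalg.cholesky(K + 1e-6 * np.eye(times.size))
    z = rng.normal(size=(n_individuals, times.size))
    # Each row is one draw from GP(gamma_k g_i, K_{eta,k}) on the time grid.
    eta[:, k, :] = genetic_mean[:, [k]] + z @ chol.T

# Noisy observed weights: w_hat | eta ~ N(eta, sigma_noise^2 I).
w_hat = eta + sigma_noise * rng.normal(size=eta.shape)
```

Each individual has a constant prior mean over time, so the genetic effect shifts the whole trajectory up or down while the kernel controls its smoothness.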
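The posterior distribution of the topic weights given the observed data is:
$$\eta_{ik}(t) \mid \text{data} \sim \mathcal{GP}\left(\mu_{ik,\text{posterior}}(t) + \gamma_k g_i,\; K_{ik,\text{posterior}}(t,t')\right)$$
Here, $\mu_{ik,\text{posterior}}(t)$ is the data-driven component of the posterior mean (the deviation from the genetic mean $\gamma_k g_i$) and $K_{ik,\text{posterior}}(t,t')$ is the posterior covariance function; both incorporate the observed data.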
2.1.2 Posterior Inference Derivation
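Given the observed weights $y_{ik}(t)$ (the noisy weights written $\hat{w}_{ik}(t)$ in Section 2.1.1), the goal is to infer the posterior distribution of the true topic weights $\eta_{ik}(t)$. The prior distribution for $\eta_{ik}(t)$ is:
$$\eta_{ik}(t) \sim \mathcal{GP}\left(\gamma_k g_i,\; K_{\eta,k}(t,t')\right)$$
The observed weights are modeled as:
$$y_{ik}(t) \mid \eta_{ik}(t) \sim \mathcal{N}\left(\eta_{ik}(t),\; \sigma^2_{\text{noise}} I\right)$$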
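Because the prior and the noise model are both Gaussian, the posterior follows from conditioning a joint Gaussian on the observations. Let $\mathbf{t}$ denote the vector of observed time points, $\mathbf{y}$ the corresponding observed weights, and $K = K_{\eta,k}$. The posterior distribution of $\eta_{ik}(t)$ given $\mathbf{y}$ is:
$$\eta_{ik}(t) \mid \mathbf{y} \sim \mathcal{GP}\left(\mu_{ik,\text{posterior}}(t) + \gamma_k g_i,\; K_{ik,\text{posterior}}(t,t')\right)$$
where:
$$\mu_{ik,\text{posterior}}(t) = K(t,\mathbf{t})\left[K(\mathbf{t},\mathbf{t}) + \sigma^2_{\text{noise}} I\right]^{-1}\left(\mathbf{y} - \gamma_k g_i \mathbf{1}\right)$$
$$K_{ik,\text{posterior}}(t,t') = K(t,t') - K(t,\mathbf{t})\left[K(\mathbf{t},\mathbf{t}) + \sigma^2_{\text{noise}} I\right]^{-1}K(\mathbf{t},t')$$
The term $\mu_{ik,\text{posterior}}(t)$ measures how far the observed data pull the topic weights away from the genetic mean $\gamma_k g_i$, so the full posterior mean is the sum $\mu_{ik,\text{posterior}}(t) + \gamma_k g_i$. The posterior covariance $K_{ik,\text{posterior}}$ describes the uncertainty in the topic weights that remains after accounting for the observed data.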
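The posterior formulas above can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than a definitive implementation: the squared-exponential kernel, the function names `rbf_kernel` and `gp_posterior`, and all numeric values are hypothetical. The genetic mean $\gamma_k g_i$ is passed in as a scalar `prior_mean`, and observed time points are kept separate from the points at which the posterior is evaluated.

```python
import numpy as np

def rbf_kernel(t1, t2, amplitude=1.0, length_scale=5.0):
    """Squared-exponential covariance (an assumed choice of kernel)."""
    diff = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(t_obs, y, t_new, prior_mean, sigma_noise, kernel=rbf_kernel):
    """Posterior mean and covariance of a GP with constant prior mean.

    prior_mean plays the role of gamma_k g_i; y holds the noisy weights
    observed at t_obs; the posterior is evaluated at t_new.
    """
    K_oo = kernel(t_obs, t_obs) + sigma_noise**2 * np.eye(t_obs.size)
    K_no = kernel(t_new, t_obs)
    K_nn = kernel(t_new, t_new)

    # Solve with a Cholesky factor instead of forming an explicit inverse.
    chol = np.linalg.cholesky(K_oo)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y - prior_mean))

    deviation = K_no @ alpha              # data-driven part of the mean
    v = np.linalg.solve(chol, K_no.T)
    cov = K_nn - v.T @ v                  # posterior covariance
    return prior_mean + deviation, cov

# Hypothetical data for one individual and one topic.
t_obs = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
y = np.array([0.9, 1.2, 0.7, 0.4, 0.6])
t_new = np.linspace(0.0, 20.0, 41)
post_mean, post_cov = gp_posterior(t_obs, y, t_new, prior_mean=0.5, sigma_noise=0.1)
```

Far from any observation the kernel terms vanish, so the posterior mean falls back to the genetic mean and the posterior variance returns to the prior variance.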
2.2 Practical Steps for Fitting and Simulation
2.2.1 Simulate Individual-Level Weights
For each topic $k$, fit a GP to the observed data to estimate the topic-level posterior mean function $\mu_{k,\text{posterior}}(t)$ and posterior covariance function $K_{k,\text{posterior}}(t,t')$.
Samples drawn from the posterior GP are new realizations of the process that reflect both the learned mean and the learned covariance structure. To simplify the simulation, we add the topic-level posterior mean to the individual's genetic effects $\gamma_k g_i$ and draw from a GP with this combined mean and the topic-level posterior covariance:
$$\eta_{ik}(t) \mid \text{data} \sim \mathcal{GP}\left(\mu_{k,\text{posterior}}(t) + \gamma_k g_i,\; K_{k,\text{posterior}}(t,t')\right)$$
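The simulation step can be sketched as below, assuming the topic-level posterior mean and covariance have already been evaluated on a time grid. The function name `simulate_individual_weights`, the placeholder mean and covariance, and the genetic offsets are all hypothetical values chosen for illustration; in practice they would come from the fitted GP and the estimated genetic effects.

```python
import numpy as np

def simulate_individual_weights(mu_post, K_post, genetic_offsets, rng, jitter=1e-6):
    """Draw one trajectory per individual from
    GP(mu_post(t) + gamma_k g_i, K_post(t, t')) on a fixed time grid.

    mu_post:         (n_times,) topic-level posterior mean
    K_post:          (n_times, n_times) topic-level posterior covariance
    genetic_offsets: (n_individuals,) values of gamma_k g_i
    """
    n_times = mu_post.size
    # Jitter guards against a numerically singular posterior covariance.
    chol = np.linalg.cholesky(K_post + jitter * np.eye(n_times))
    z = rng.normal(size=(genetic_offsets.size, n_times))
    return mu_post[None, :] + genetic_offsets[:, None] + z @ chol.T

rng = np.random.default_rng(1)
times = np.linspace(0.0, 50.0, 26)

# Placeholder posterior quantities standing in for a fitted GP.
mu_post = 0.1 * np.sin(times / 10.0)
diff = times[:, None] - times[None, :]
K_post = 0.05 * np.exp(-0.5 * (diff / 8.0) ** 2)

genetic_offsets = np.array([-0.3, 0.0, 0.2, 0.5])   # hypothetical gamma_k g_i
eta_sim = simulate_individual_weights(mu_post, K_post, genetic_offsets, rng)
```

Because the covariance is shared at the topic level, the Cholesky factor is computed once per topic and reused for every individual; only the mean shift differs between individuals.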