Chapter 10 Probabilistic Model Formulation

Notation

  • d : Index for an individual

  • t : Time point

  • w_{d,t} : Set of new diagnoses observed for individual d at time t

  • k : Topic index

  • v : Disease index

  • η_{t,k} : Natural parameters of topic k at time t

  • π_{v,k,t} : Probability of diagnosis v within topic k at time t

  • θ_{d,t} : Topic proportions for individual d at time t

  • γ_d : Genetic stickiness parameter for individual d

  • W(t, ρ_d) : Time-warping function for individual d

  • ρ_d : Individual-specific warping parameter

Gaussian Process for Topic Evolution

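  • Assume a squared exponential kernel with signal variance \sigma^2 and length scale l (here k(t, t') denotes the kernel function, not the topic index k): \[k(t, t') = \sigma^2 \exp\left(-\frac{(t - t')^2}{2 l^2}\right)\]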
  • GP prior on the natural parameters, with mean function \mu(t): \[p(\eta_{t,k} | \mu(t), k(t, t')) = \mathcal{N}(\mu(t), k(t, t'))\]
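
As an illustration of the kernel and Gaussian distribution above, the following is a minimal sketch that builds the squared exponential covariance matrix on a time grid and draws one trajectory of η for a single topic. The function names, the zero mean function, the jitter term, and the specific values of σ and l are assumptions made for the example, not part of the model specification.

```python
import numpy as np

def squared_exponential_kernel(t1, t2, sigma=1.0, length_scale=1.0):
    """Pairwise k(t, t') = sigma^2 * exp(-(t - t')^2 / (2 * l^2))."""
    t1 = np.asarray(t1, dtype=float)
    t2 = np.asarray(t2, dtype=float)
    sq_dist = (t1[:, None] - t2[None, :]) ** 2
    return sigma**2 * np.exp(-0.5 * sq_dist / length_scale**2)

def sample_eta(times, mean_fn, sigma=1.0, length_scale=1.0, jitter=1e-6, seed=0):
    """Draw one sample of eta over `times` from N(mean_fn(times), K).

    `jitter` is a small diagonal term added only for numerical stability
    of the Cholesky factorization (an implementation assumption).
    """
    times = np.asarray(times, dtype=float)
    K = squared_exponential_kernel(times, times, sigma, length_scale)
    K = K + jitter * np.eye(len(times))
    chol = np.linalg.cholesky(K)
    rng = np.random.default_rng(seed)
    return mean_fn(times) + chol @ rng.standard_normal(len(times))

# Hypothetical time grid and hyperparameters, chosen for illustration only.
times = np.linspace(0.0, 10.0, 20)
K = squared_exponential_kernel(times, times, sigma=1.5, length_scale=2.0)
eta_k = sample_eta(times, mean_fn=np.zeros_like, sigma=1.5, length_scale=2.0)
```

A larger length scale l makes nearby time points more strongly correlated, so the sampled trajectory varies more slowly; σ controls its overall amplitude.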

Topic Distribution

  • Prior (Dirichlet distribution): \[p(\theta_{d,t=0} | \alpha) = \text{Dirichlet}(\alpha)\]
  • Update Likelihood: \[p(\theta_{d,t} | \eta_t, w_{d,t}, \gamma_d) \propto \prod_{k=1}^K \left[\theta_{d, t-1, k} \cdot \prod_{v \in w_{d,t}} \pi_{v,k,t} \right]^{\gamma_d}\]
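
One possible reading of the update rule is as a point update: treat each factor in the product as an unnormalized weight for topic k, then normalize across topics. The sketch below follows that reading and works in log space for stability. The function name `update_theta`, the array layout of `pi_t` (diseases by topics), the toy values, and the normalization step itself are assumptions for illustration; the text only specifies proportionality.

```python
import numpy as np

def update_theta(theta_prev, pi_t, diagnoses, gamma_d):
    """Point update: theta_k proportional to
    [theta_prev_k * prod_{v in diagnoses} pi_t[v, k]] ** gamma_d.

    theta_prev : (K,) topic proportions at time t-1 (assumed strictly positive)
    pi_t       : (V, K) probability of disease v within topic k at time t
    diagnoses  : disease indices observed at time t (the set w_{d,t})
    gamma_d    : stickiness exponent for individual d
    """
    theta_prev = np.asarray(theta_prev, dtype=float)
    pi_t = np.asarray(pi_t, dtype=float)
    log_w = np.log(theta_prev)
    if len(diagnoses) > 0:
        log_w = log_w + np.log(pi_t[list(diagnoses), :]).sum(axis=0)
    log_w = gamma_d * log_w
    log_w = log_w - log_w.max()  # guard against underflow before exponentiating
    weights = np.exp(log_w)
    return weights / weights.sum()

# Toy setup: V = 3 diseases, K = 2 topics; each column of pi_t sums to 1.
pi_t = np.array([[0.7, 0.1],
                 [0.2, 0.3],
                 [0.1, 0.6]])

# gamma_d = 1: weights 0.35 and 0.05 normalize to [0.875, 0.125].
theta_new = update_theta([0.5, 0.5], pi_t, diagnoses=[0], gamma_d=1.0)

# Starting from a draw of the Dirichlet prior at t = 0.
rng = np.random.default_rng(0)
theta_0 = rng.dirichlet(np.ones(2))
theta_1 = update_theta(theta_0, pi_t, diagnoses=[0, 1], gamma_d=1.0)
```

Under this reading, γ_d > 1 sharpens the updated proportions toward the dominant topic, while γ_d < 1 flattens them.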

Predictive Likelihood for New Diagnoses

  • \[p(w_{d,t+\Delta t} | \theta_{d,t}, \eta_{t+\Delta t}) = \prod_{v \in w_{d,t+\Delta t}} \sum_{k=1}^K \theta_{d,t,k} \cdot \pi_{v,k,t+\Delta t}\]
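
The predictive likelihood can be evaluated directly as a product over new diagnoses of a topic-weighted mixture. The following is a minimal sketch that returns the log of that product; the function name, the (diseases by topics) layout of `pi_future`, and the toy values are assumptions for the example.

```python
import numpy as np

def predictive_log_likelihood(theta_t, pi_future, new_diagnoses):
    """log prod_{v in new_diagnoses} sum_k theta_t[k] * pi_future[v, k].

    theta_t       : (K,) topic proportions at time t
    pi_future     : (V, K) disease probabilities per topic at time t + dt
    new_diagnoses : disease indices observed at time t + dt
    """
    theta_t = np.asarray(theta_t, dtype=float)
    pi_future = np.asarray(pi_future, dtype=float)
    if len(new_diagnoses) == 0:
        return 0.0  # empty product equals 1
    per_diagnosis = pi_future[list(new_diagnoses), :] @ theta_t
    return float(np.log(per_diagnosis).sum())

# Toy setup: V = 3 diseases, K = 2 topics; each column of pi_future sums to 1.
pi_future = np.array([[0.7, 0.1],
                      [0.2, 0.3],
                      [0.1, 0.6]])
theta_t = np.array([0.875, 0.125])

# Diagnosis 0 contributes 0.625 and diagnosis 2 contributes 0.1625,
# so the likelihood is 0.625 * 0.1625 = 0.1015625.
log_lik = predictive_log_likelihood(theta_t, pi_future, new_diagnoses=[0, 2])
```

Summing logs rather than multiplying probabilities avoids underflow when an individual has many new diagnoses.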