Chapter 13 Probabilistic Model Formulation

Notations

  • d : Index for an individual

  • t : Time point

  • w_{d,t} : Set of new diagnoses observed for individual d at time t

  • k : Topic index

  • v : Disease index

  • η_{t,k} : Natural parameters of topic k at time t

  • π_{v,k,t} : Probability of diagnosis v within topic k at time t

  • θ_{d,t} : Topic proportions for individual d at time t

  • γ_d : Genetic stickiness parameter for individual d

  • W(t, ρ_d) : Time-warping function for individual d

  • ρ_d : Individual-specific warping parameter

Gaussian Process for Topic Evolution

  • Assume a squared exponential kernel: k(t, t') = \sigma^2 \exp\left(-\frac{1}{2} \frac{(t - t')^2}{l^2}\right)
  • Likelihood: p(\eta_{t,k} | \mu(t), k(t, t')) = \mathcal{N}(\mu(t), k(t, t'))
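
As a minimal sketch of the two bullets above, the following NumPy snippet builds the squared exponential covariance on a small time grid and draws one trajectory of \eta_{t,k} for a single topic. The function name se_kernel, the grid, the hyperparameter values, the zero mean function, and the jitter term are all illustrative assumptions rather than part of the model specification.

```python
import numpy as np

def se_kernel(t, t_prime, sigma=1.0, length=2.0):
    """Squared exponential kernel: sigma^2 * exp(-(t - t')^2 / (2 * length^2))."""
    diff = np.subtract.outer(t, t_prime)  # pairwise differences t - t'
    return sigma**2 * np.exp(-0.5 * diff**2 / length**2)

rng = np.random.default_rng(0)
times = np.arange(10.0)              # assumed time grid t = 0, 1, ..., 9
K_tt = se_kernel(times, times)       # covariance matrix over the grid
mean = np.zeros_like(times)          # assumed mean function mu(t) = 0
jitter = 1e-6 * np.eye(len(times))   # numerical stabiliser, not part of the model

# One draw of the trajectory eta_{., k} for a single topic k.
eta_k = rng.multivariate_normal(mean, K_tt + jitter)
```

A larger length scale l makes nearby time points more strongly correlated, so the sampled trajectory varies more smoothly over time.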

Topic Distribution

  • Prior (Dirichlet distribution): p(\theta_{d,t=0} | \alpha) = \text{Dirichlet}(\alpha)
  • Update Likelihood: p(\theta_{d,t} | \eta_t, w_{d,t}, \gamma_d) \propto \prod_{k=1}^K \left[\theta_{d, t-1, k} \cdot \prod_{v \in w_{d,t}} \pi_{v,k,t} \right]^{\gamma_d}
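
The update rule can be illustrated numerically under one possible reading, assumed here: each bracketed factor is treated as an unnormalised weight for topic k, and the weights are normalised across topics to give a point update of \theta_{d,t}. The function name update_topic_proportions, the array layout (diagnoses by topics), and the toy numbers are hypothetical.

```python
import numpy as np

def update_topic_proportions(theta_prev, pi_t, diagnoses, gamma_d):
    """Return weights proportional to [theta_{d,t-1,k} * prod_v pi_{v,k,t}]^gamma_d.

    theta_prev : shape (K,), topic proportions at time t-1
    pi_t       : shape (V, K), pi_t[v, k] = probability of diagnosis v in topic k
    diagnoses  : indices v of the new diagnoses in w_{d,t}
    gamma_d    : stickiness exponent for individual d
    """
    idx = np.asarray(diagnoses, dtype=int)
    # Work in log space so long products of small probabilities do not underflow.
    log_w = np.log(theta_prev) + np.log(pi_t[idx]).sum(axis=0)
    log_w = gamma_d * log_w
    log_w -= log_w.max()          # stabilise before exponentiating
    w = np.exp(log_w)
    return w / w.sum()            # assumed normalisation across topics

theta_prev = np.array([0.5, 0.5])
pi_t = np.array([[0.8, 0.1],      # diagnosis 0 is typical of topic 0
                 [0.2, 0.9]])     # diagnosis 1 is typical of topic 1
theta_new = update_topic_proportions(theta_prev, pi_t, [0], gamma_d=1.0)
```

With these toy numbers, observing diagnosis 0 shifts the proportions toward topic 0. Under this reading, \gamma_d = 0 flattens the weights to uniform, and an empty diagnosis set with \gamma_d = 1 leaves \theta unchanged.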

Predictive Likelihood for New Diagnoses

  • p(w_{d,t+\Delta t} | \theta_{d,t}, \eta_{t+\Delta t}) = \prod_{v \in w_{d,t+\Delta t}} \sum_{k=1}^K \theta_{d,t,k} \cdot \pi_{v,k,t+\Delta t}
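
A short sketch of how the predictive likelihood could be evaluated: for each new diagnosis v, mix the per-topic probabilities \pi_{v,k,t+\Delta t} using the current topic proportions \theta_{d,t,k}, then multiply across diagnoses (summed here in log space). The function name predictive_log_likelihood and the toy values are assumptions for illustration, and \pi is taken as already computed from \eta.

```python
import numpy as np

def predictive_log_likelihood(theta, pi_future, diagnoses):
    """Log of prod_{v in w} sum_k theta_k * pi_future[v, k].

    theta     : shape (K,), topic proportions theta_{d,t}
    pi_future : shape (V, K), diagnosis probabilities at time t + Delta t
    diagnoses : indices v of the diagnoses in w_{d,t+Delta t}
    """
    idx = np.asarray(diagnoses, dtype=int)
    per_diagnosis = pi_future[idx] @ theta   # sum over topics k, one value per v
    return np.log(per_diagnosis).sum()       # log of the product over v

theta = np.array([0.25, 0.75])
pi_future = np.array([[0.8, 0.1],
                      [0.2, 0.9]])
log_lik = predictive_log_likelihood(theta, pi_future, [0, 1])
```

Here the per-diagnosis mixture probabilities are 0.275 and 0.725, so the likelihood is their product. Returning the log keeps the computation stable when an individual has many new diagnoses.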