Chapter 12 Likelihood Components with Warping:

Gaussian Process Likelihood: (Remains the same)

\[p(\eta_{t,k} \mid \mu(t), k(t, t')) = \mathcal{N}(\mu(t), k(t, t'))\]

Topic Distribution Prior: (Remains the same)

\[p(\theta_{d,t=0} \mid \alpha) = \text{Dirichlet}(\alpha)\]

Topic Distribution Update Likelihood (Incorporating Warping):

Conceptual: The likelihood of observing the new diagnoses given the topic distribution is now conditioned on the individual’s warped time.

Representation:

\[p(\theta_{d,t} \mid \eta_t, w_{d,t}, \gamma_d, \rho_d) \propto \prod_{k=1}^K \left[\theta_{d,t-1,k} \cdot \prod_{v \in w_{d,t}} \pi_{v,k,W(t, \rho_d)} \right]^{\gamma_d}\]

Note the \(W(t, \rho_d)\) in the subscript of \(\pi_{v,k,W(t, \rho_d)}\), highlighting that disease probabilities are obtained from the warped timeline.

Predictive Likelihood (Incorporating Warping):

Similarly, prediction depends on the warped time for the individual:

\[p(w_{d,t+\Delta t} \mid \theta_{d,t}, \eta_{t+\Delta t}, \rho_d) = \prod_{v \in w_{d,t+\Delta t}} \sum_{k} \theta_{d,t,k} \cdot \pi_{v,k,W(t + \Delta t, \rho_d)}\]

Key Points

Indirect Representation: The effect of warping is implicit in the modified likelihoods through the warped time index \(W(t, \rho_d)\).

Procedural Implementation: Computationally, you’d likely have procedural steps to determine the appropriate values of \(\eta\) and \(\pi\) corresponding to the warped time of an individual.
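The lookup, update, and prediction steps could be sketched as follows. This is a minimal illustration, not the model's actual implementation. The notes do not give a form for \(W(t, \rho_d)\), so the linear warp `rho * t` (rounded and clamped to the time grid) is an assumption. The function names `warp_time`, `update_theta`, and `predict_words` are hypothetical, \(\pi\) is supplied directly as a toy diseases x topics x time array rather than derived from \(\eta\), and the bracketed term of the update is read as the unnormalised weight of topic \(k\).

```r
# ASSUMPTION: linear warp W(t, rho) = rho * t, rounded and clamped to 1..n_times.
warp_time <- function(t, rho, n_times) {
  idx <- round(rho * t)
  min(max(idx, 1L), n_times)
}

# Update step: theta_k proportional to (theta_prev_k * prod_v pi[v, k, W(t, rho)])^gamma.
update_theta <- function(theta_prev, pi_arr, words, t, rho, gamma) {
  w <- warp_time(t, rho, dim(pi_arr)[3])
  pi_w <- matrix(pi_arr[words, , w, drop = FALSE], nrow = length(words))
  log_lik <- colSums(log(pi_w))                  # one log-likelihood per topic
  log_post <- gamma * (log(theta_prev) + log_lik)
  post <- exp(log_post - max(log_post))          # subtract the max for stability
  post / sum(post)
}

# Predictive probability: prod_v sum_k theta_k * pi[v, k, W(t + dt, rho)].
predict_words <- function(theta, pi_arr, words, t, dt, rho) {
  w <- warp_time(t + dt, rho, dim(pi_arr)[3])
  pi_w <- matrix(pi_arr[words, , w, drop = FALSE], nrow = length(words))
  prod(pi_w %*% theta)
}

# Toy inputs (hypothetical): 4 diseases, 2 topics, 10 time points.
set.seed(1)
V <- 4; K <- 2; n_times <- 10
pi_arr <- array(runif(V * K * n_times), dim = c(V, K, n_times))
# Normalise so each (topic, time) slice is a probability vector over diseases.
pi_arr <- sweep(pi_arr, c(2, 3), apply(pi_arr, c(2, 3), sum), "/")

theta0 <- c(0.5, 0.5)
theta1 <- update_theta(theta0, pi_arr, words = c(1, 3), t = 4, rho = 1.5, gamma = 1)
p_next <- predict_words(theta1, pi_arr, words = 2, t = 4, dt = 1, rho = 1.5)
```

With `rho = 1.5`, the individual at `t = 4` reads \(\pi\) from grid position 6, so two people at the same nominal time but with different \(\rho_d\) see different disease probabilities. That is the only place warping enters this sketch.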

library(PheWAS)
library(data.table)
library(stringr)
library(stringi)
library(lubridate)
library(ATM)
library(dplyr)
library(tidyr)
library(ggridges)
library(ggplot2)

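# Standardize every column except column 37, then reattach the identifier
df_with_prs = readRDS("~/Library/CloudStorage/Dropbox-Personal/pheno_dir/prs_subset.rds")
a = data.frame(apply(df_with_prs[, -37], 2, scale))
a$Id = df_with_prs$Identifier
# Reshape to long format: one row per (Id, variable) pair
library(reshape2)
m = melt(a, id.vars = "Id")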

# Ridge plot of the standardized values, one ridge per variable
ggplot(m, aes(x = value, y = variable, fill = variable)) +
  geom_density_ridges() +
  theme_ridges()

# Attach phenotype labels to each diagnosis record and keep the columns used below
i = readRDS("~/Library/CloudStorage/Dropbox-Personal/icd10phe_lab.rds")
a = i %>% left_join(disease_info_phecode_icd10, by = c("diag_icd10" = "phecode"))
df_good = a[, c("eid", "age_diag", "phenotype")]

# Age at first diagnosis for each individual and phenotype
df_grouped <- df_good %>%
  group_by(eid, phenotype) %>%
  summarize(first_age_diag = min(age_diag, na.rm = TRUE), .groups = "drop")

# One row per individual, one column per phenotype (NA = never diagnosed)
df_wide <- df_grouped %>%
  pivot_wider(names_from = phenotype, values_from = first_age_diag)
dim(df_wide)
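To make the reshaping concrete, here is a small sketch on made-up data. The `toy` data frame and its values are hypothetical and only reuse the column names from `df_good`. It shows how repeated diagnoses collapse to the earliest age, and how phenotypes an individual never had become `NA` in the wide table.

```r
library(dplyr)
library(tidyr)

# Hypothetical long-format records: individual 1 has two Asthma entries.
toy <- data.frame(
  eid = c(1, 1, 1, 2, 2),
  age_diag = c(50, 45, 60, 70, NA),
  phenotype = c("Asthma", "Asthma", "Gout", "Gout", "Gout")
)

toy_wide <- toy %>%
  group_by(eid, phenotype) %>%
  # Earliest age per (eid, phenotype); missing ages are ignored.
  summarize(first_age_diag = min(age_diag, na.rm = TRUE), .groups = "drop") %>%
  # One column per phenotype; absent combinations are filled with NA.
  pivot_wider(names_from = phenotype, values_from = first_age_diag)

toy_wide
```

One caveat: if every `age_diag` in a group is `NA`, `min(..., na.rm = TRUE)` returns `Inf` with a warning rather than `NA`. Such a cell would later be counted as a diagnosis, so it may be worth checking for infinite values before filtering.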

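# Keep phenotypes with more than 1000 cases
col = df_wide[, (colSums(!is.na(df_wide)) > 1000)]
# Keep individuals with more than 2 non-missing entries (the eid column counts as one, so at least 2 diagnoses)
row = col[(rowSums(!is.na(col)) > 2), ]
# Re-apply the column filter, since dropping rows lowers the per-phenotype counts
colrow = row[, (colSums(!is.na(row)) > 1000)]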

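# Binary indicator matrix: 1 = ever diagnosed, 0 = never; the first column (eid) is dropped
p = data.frame(apply(colrow[, -1], 2, function(x) { ifelse(is.na(x), 0, 1) }))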
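The filtering and binarization can be traced on a toy table. The data and the thresholds `min_cases` and `min_entries` below are hypothetical, scaled-down stand-ins for the cutoffs of 1000 and 2 used above. The point of the sketch is that the row count includes the id column, and that rare phenotypes and sparsely diagnosed individuals drop out before the 0/1 matrix is built.

```r
# Hypothetical wide table: first column is the id, the rest are first-diagnosis ages.
toy_wide <- data.frame(
  eid = 1:4,
  Asthma = c(45, NA, 30, NA),
  Gout = c(60, 70, NA, NA),
  Rare = c(NA, NA, NA, 80)
)

min_cases <- 1    # stand-in for the 1000-case column threshold
min_entries <- 1  # stand-in for the row threshold; eid itself counts as one entry

# Drop phenotypes with too few cases (Rare has a single case).
cols <- toy_wide[, colSums(!is.na(toy_wide)) > min_cases]
# Drop individuals with too few non-missing entries (individual 4 keeps only eid).
rows <- cols[rowSums(!is.na(cols)) > min_entries, ]
# Binarize: 1 where an age is recorded, 0 where it is NA; the eid column is excluded.
indicator <- data.frame(apply(rows[, -1], 2, function(x) ifelse(is.na(x), 0, 1)))

indicator
```

In this toy run, `Rare` and individual 4 are removed, leaving a 3 x 2 indicator table for `Asthma` and `Gout`.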