D Characterization of Stemness-induced Heterogeneity on Proteomics Networks in Breast Cancer
D.1 Data description
The dataset comes from The Cancer Genome Atlas (TCGA, Weinstein et al. (2013)) and The Cancer Proteome Atlas (TCPA, Li et al. (2013)), and was processed by Malta et al. (2018) to derive two independent stemness indices, one based on DNA methylation (mDNAsi) and one based on mRNA expression (mRNAsi). Designed to quantify the degree of dedifferentiation at the epigenetic and gene expression levels, mDNAsi and mRNAsi range from 0 to 1, with lower values indicating a tendency toward normal-like cells. The abundances of 189 proteins are measured across 616 breast cancer (BRCA) samples from TCGA (Weinstein et al. 2013) and can be downloaded from the NIH Genomic Data Commons (GDC) website.
D.2 Preprocessing and application
The mRNAsi, mDNAsi, and patients’ ages are treated as three intrinsic factors. A logit transformation is applied to the stemness indices to place them on the same scale as age, and all three intrinsic factors and the proteomics data are standardized before being plugged into the model. Hyperparameters are set as in Section C.2.
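The preprocessing described above can be sketched as follows. This is a minimal illustration rather than the pipeline actually used: the function names, the clipping constant `eps` (added only to keep the logit finite at 0 and 1), and the choice of the sample standard deviation are all assumptions.

```python
import numpy as np

def logit(p, eps=1e-6):
    """Map indices in [0, 1] to the real line.

    Clipping by eps is an assumption made here to avoid infinities at 0 and 1.
    """
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    return np.log(p / (1.0 - p))

def standardize(x):
    """Center each column and scale it to unit sample standard deviation."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)

def prepare_covariates(mrnasi, mdnasi, age):
    """Stack the three intrinsic factors (hypothetical helper).

    The two stemness indices are logit-transformed, age is left on its
    original scale, and all three columns are then standardized.
    """
    z = np.column_stack([logit(mrnasi), logit(mdnasi), np.asarray(age, dtype=float)])
    return standardize(z)
```

The protein abundance matrix would be passed through `standardize` in the same way, column by column.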
Here we present the proteomics data analysis with mDNAsi in breast cancer, where mRNAsi and patients’ age are fixed at their medians; the mRNAsi-related results can be found in the Results section. For the mDNAsi case, we limit the protein connections using the following criteria:
protein pairs with different isoforms,
significant correlation (FDR-based p-values \(<\) \(0.01\)) in more than half of the cases,
a correlation magnitude exceeding \(0.2\) in at least one case.
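As an illustration of how the criteria above could be applied, the sketch below filters protein pairs given two arrays of shape (cases, pairs). It rests on several assumptions that the text does not state: the three criteria are combined conjunctively, the FDR-based values are already computed and passed in as `fdr`, and the isoform criterion is supplied as a precomputed boolean mask `isoform_ok`. All names are hypothetical.

```python
import numpy as np

def select_edges(fdr, corr, isoform_ok, alpha=0.01, min_abs_corr=0.2):
    """Return a boolean mask of retained protein pairs.

    fdr, corr  : arrays of shape (n_cases, n_pairs)
    isoform_ok : boolean array of shape (n_pairs,), assumed precomputed
    """
    fdr = np.asarray(fdr, dtype=float)
    corr = np.asarray(corr, dtype=float)
    n_cases = fdr.shape[0]

    # Number of cases in which each pair is significant at level alpha.
    n_significant = (fdr < alpha).sum(axis=0)
    # Criterion: significant in strictly more than half of the cases.
    often_significant = n_significant > n_cases / 2
    # Criterion: |correlation| above the threshold in at least one case.
    strong_somewhere = (np.abs(corr) > min_abs_corr).any(axis=0)

    # Assumption: a pair is kept only if all criteria hold.
    keep = np.asarray(isoform_ok, dtype=bool) & often_significant & strong_somewhere
    return keep, n_significant
```

The second return value, the per-pair count of significant cases, is the kind of quantity that a barplot of significant cases per protein pair would display.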
D.3 Results
Figures D.1 and D.2 are heatmaps showing the posterior inclusion probability (PIP) and the partial correlation of the selected edges as mDNAsi varies, with mRNAsi and age fixed at their medians. The color bar on the left shows mDNAsi divided into quarters. The barplot on top gives the number of significant cases for the corresponding protein pair.
Figure D.3 presents networks for the proteins with the five highest connectivity degrees in each quarter of mDNAsi. Edge widths are proportional to the median partial correlation between the selected protein pairs within each quarter of mDNAsi. Node sizes reflect the median connectivity degrees.
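The node summaries behind such a network can be sketched as below. This is an assumption-laden illustration, not the authors' code: it takes a hypothetical boolean array `adjacency` of shape (cases, proteins, proteins) marking selected edges for each case, and it interprets the quarters of mDNAsi as empirical quartiles of the observed values, which the text does not specify.

```python
import numpy as np

def top_degree_by_quarter(mdnasi, adjacency, k=5):
    """For each quarter of mDNAsi, return the k proteins with the largest
    median connectivity degree as (protein index, median degree) pairs."""
    mdnasi = np.asarray(mdnasi, dtype=float)
    adjacency = np.asarray(adjacency, dtype=bool)

    # Degree of each protein in each case: number of selected edges.
    degree = adjacency.sum(axis=2)  # shape (n_cases, n_proteins)

    # Assumption: quarters are defined by the empirical quartiles.
    cuts = np.quantile(mdnasi, [0.25, 0.5, 0.75])
    quarter = np.searchsorted(cuts, mdnasi, side="right")  # values in 0..3

    result = {}
    for q in range(4):
        # Median degree over the cases falling in this quarter.
        med = np.median(degree[quarter == q], axis=0)
        top = np.argsort(-med, kind="stable")[:k]
        result[q] = [(int(j), float(med[j])) for j in top]
    return result
```

Edge widths could be obtained analogously by taking the median of the partial correlations over the cases in each quarter.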
Figure D.4 shows how the correlation of selected protein pairs changes with mDNAsi on the original scale when mRNAsi and age are fixed at their medians.
Finally, the results of the integrated functional analysis of pathways are shown in Figure D.5. This plot illustrates how the pathway connectivity scores change with mDNAsi on the original scale.