F Gene Expression Network Estimation incorporating Spatial Heterogeneity

F.1 Data description

The human breast cancer data were collected from a breast cancer biopsy at a thickness of 16 \(\mu\)m (Ståhl et al. 2016). Based on the hematoxylin and eosin (H&E) staining image, spot locations can be classified into three spatial regions: tumor, intermediate, and normal, with 114, 67, and 69 spots, respectively. The data include expression measurements for 5262 genes at 250 spot locations.

F.2 Preprocessing and application

Here we consider only the 100 spatially expressed genes with the lowest Benjamini-Hochberg (BH) adjusted p-values obtained from the SPARK method (Sun, Zhu, and Zhou 2020). Next, we apply the PQLseq algorithm (Sun et al. 2019) to adjust for covariate effects and obtain the latent gene expressions, which follow a normal distribution.
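The gene-selection step can be illustrated with a minimal sketch, assuming a vector of raw per-gene p-values is already available (for example from SPARK). The function names `bh_adjust` and `top_k_genes` and the toy p-values are hypothetical and are not part of SPARK or of the original analysis; the sketch shows only the BH adjustment and the ranking, not the spatial test itself.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def top_k_genes(gene_names, pvalues, k):
    """Names of the k genes with the smallest BH-adjusted p-values."""
    adjusted = bh_adjust(pvalues)
    ranked = sorted(range(len(gene_names)), key=lambda i: adjusted[i])
    return [gene_names[i] for i in ranked[:k]]


# Toy example with invented p-values (not from the breast cancer data).
genes = ["g1", "g2", "g3", "g4"]
pvals = [0.001, 0.04, 0.02, 0.20]
adj = bh_adjust(pvals)
selected = top_k_genes(genes, pvals, k=2)
print(selected)
```

In the setting described above, `k` would be 100 and `genes` would hold the 5262 measured genes.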

The two spatial coordinates of each spot are scaled and treated as intrinsic factors in the GraphR model. Hyperparameters are set as in Section C.2. The results include only correlations with FDR-based p-values \(<0.01\).
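As an illustration of the coordinate scaling, the sketch below standardizes each coordinate to mean 0 and unit standard deviation. The text does not specify which scaling is used, so z-score standardization is an assumption here, and the helper `standardize` and the toy coordinates are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Center to mean 0 and scale to unit (population) standard deviation."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot scale a constant coordinate")
    return [(v - mu) / sd for v in values]


# Toy spot coordinates (invented, not the actual array layout).
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 10.0, 30.0, 30.0]

# One (x, y) pair of scaled intrinsic factors per spot.
intrinsic = list(zip(standardize(x), standardize(y)))
```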

F.3 Results

For each pair of genes \(i\) and \(j\), the partial correlations and the corresponding posterior inclusion probabilities (PIPs) are vectors of length \(n_1+n_2+n_3\), where \(n_1\), \(n_2\), and \(n_3\) denote the numbers of spots in the tumor, intermediate, and normal regions, respectively: \[\begin{equation} \begin{split} & \rho_{ij} = [\rho_{ij}^{tumor}, \rho_{ij}^{inter}, \rho_{ij}^{normal}] \in \mathbb{R}^{n_1+n_2+n_3}, \\ & PIP_{ij} = [PIP_{ij}^{tumor}, PIP_{ij}^{inter}, PIP_{ij}^{normal}] \in \mathbb{R}^{n_1+n_2+n_3}. \end{split} \end{equation}\] We define the weighted average of partial correlations between gene \(i\) and gene \(j\) in a region as \(\hat{\rho}_{ij}^{region} = \frac{1}{n_{region}} \sum_{s \in region} \rho_{ij,s}^{region} \, PIP_{ij,s}^{region}\), where \(s\) indexes the spots in the region, \(n_{region}\) is the number of spots in it, and the region can be tumor, intermediate, or normal. The weighted connectivity degree of gene \(i\) in a region is defined as the sum of \(|\hat{\rho}_{ij}^{region}|\) over genes \(j\).
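The two summaries defined above can be sketched for a single region as follows. This is a minimal illustration rather than the GraphR implementation: the function names, the dictionary layout keyed by gene pairs \((i, j)\) with \(i < j\), and the toy values are all assumptions, and the degree sums over partners \(j \neq i\) under the assumption that the estimates are symmetric in \(i\) and \(j\).

```python
def weighted_avg(rho, pip):
    """PIP-weighted average of spot-level partial correlations in one region."""
    if len(rho) != len(pip) or not rho:
        raise ValueError("rho and pip must be non-empty and of equal length")
    return sum(r * p for r, p in zip(rho, pip)) / len(rho)


def connectivity_degrees(rho, pip, n_genes):
    """Weighted connectivity degree of every gene in one region.

    rho and pip map a gene pair (i, j), i < j, to per-spot lists.
    Each pair contributes |weighted average| to both of its genes.
    """
    degree = [0.0] * n_genes
    for (i, j), spot_rho in rho.items():
        w = abs(weighted_avg(spot_rho, pip[(i, j)]))
        degree[i] += w
        degree[j] += w
    return degree


# Toy region with 2 spots and 3 genes (all values invented).
rho = {(0, 1): [0.5, 0.3], (0, 2): [-0.2, -0.4], (1, 2): [0.0, 0.1]}
pip = {(0, 1): [1.0, 1.0], (0, 2): [1.0, 0.5], (1, 2): [0.0, 0.0]}

rho_hat_01 = weighted_avg(rho[(0, 1)], pip[(0, 1)])
deg = connectivity_degrees(rho, pip, n_genes=3)
```

Running the same functions separately on the tumor, intermediate, and normal spots would give one set of weighted averages and degrees per region.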

Figures F.1, F.2, and F.3 show the networks for each spatial region. Edges are scaled in proportion to the weighted average of partial correlations, and nodes in proportion to the weighted connectivity degrees.

Tumor region

Figure F.1: Network of tumor region in breast cancer.

Intermediate region

Figure F.2: Network of intermediate region in breast cancer.

Normal region

Figure F.3: Network of normal region in breast cancer.

We also display the spatial patterns of partial correlations for selected gene pairs (Figure F.4) and of connectivity degrees for selected genes (Figure F.5). The color bar indicates the values of the correlations and connectivity degrees, while the shape of each point represents its spatial region.

Figure F.4: Spatial pattern of partial correlations for selected gene pairs.

Figure F.5: Spatial pattern of connectivity degrees of selected genes.

References

Ståhl, Patrik L, Fredrik Salmén, Sanja Vickovic, Anna Lundmark, José Fernández Navarro, Jens Magnusson, Stefania Giacomello, et al. 2016. “Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics.” Science 353 (6294): 78–82.

Sun, Shiquan, Jiaqiang Zhu, Sahar Mozaffari, Carole Ober, Mengjie Chen, and Xiang Zhou. 2019. “Heritability Estimation and Differential Analysis of Count Data with Generalized Linear Mixed Models in Genomic Sequencing Studies.” Bioinformatics 35 (3): 487–96.

Sun, Shiquan, Jiaqiang Zhu, and Xiang Zhou. 2020. “Statistical Analysis of Spatial Expression Patterns for Spatially Resolved Transcriptomic Studies.” Nature Methods 17 (2): 193–200.