Chapter 21 First Round Clustering Based on Tile Matrix

library(ArchR)
library(magrittr)
library(tidyverse)
set.seed(1)

21.1 Description

  • To cluster cells, we use a two-round clustering strategy:
    • First round: based on the 500 bp tile matrix
    • Second round: based on the union peak set merged from per-cluster peaks
  • Both rounds use iterative LSI for dimensionality reduction
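
To make the idea of iterative LSI concrete, here is a minimal base-R sketch on simulated data. It is a simplified illustration, not ArchR's implementation: the helper `lsi_sketch`, the toy matrix, the feature counts, and the use of k-means in place of graph-based clustering are all assumptions made for this example.

```r
set.seed(1)

# Hypothetical toy data: 300 tiles x 60 cells, two groups with different open tiles.
n_tiles <- 300
n_cells <- 60
group <- rep(c(1, 2), each = n_cells / 2)
prob <- matrix(0.05, n_tiles, n_cells)
prob[1:50, group == 1] <- 0.5    # tiles open mainly in group 1
prob[51:100, group == 2] <- 0.5  # tiles open mainly in group 2
tile_mat <- matrix(rbinom(n_tiles * n_cells, 1, prob), n_tiles, n_cells)

# One round of LSI: TF-IDF normalisation followed by a truncated SVD.
lsi_sketch <- function(mat, n_dims = 5) {
  # Drop tiles with no signal so the inverse document frequency stays finite.
  mat <- mat[rowSums(mat) > 0, , drop = FALSE]
  # Term frequency: scale each cell (column) by its number of accessible tiles.
  tf <- sweep(mat, 2, pmax(colSums(mat), 1), "/")
  # Inverse document frequency: down-weight tiles accessible in many cells.
  idf <- log(1 + ncol(mat) / rowSums(mat))
  tfidf <- tf * idf
  # Cells are embedded as right singular vectors scaled by the singular values.
  sv <- svd(tfidf, nu = 0, nv = n_dims)
  sv$v %*% diag(sv$d[seq_len(n_dims)], nrow = n_dims)
}

# Iteration 1: LSI on the most accessible tiles.
top_tiles <- order(rowSums(tile_mat), decreasing = TRUE)[1:150]
emb1 <- lsi_sketch(tile_mat[top_tiles, ], n_dims = 5)

# Cluster the cells on the first embedding (k-means as a simple stand-in).
cl <- kmeans(emb1, centers = 2, nstart = 10)$cluster

# Average accessibility per cluster, then keep the tiles that vary most across clusters.
cluster_mat <- sapply(sort(unique(cl)), function(k) {
  rowMeans(tile_mat[, cl == k, drop = FALSE])
})
var_tiles <- order(apply(cluster_mat, 1, var), decreasing = TRUE)[1:100]

# Iteration 2: rerun LSI on the variable tiles only.
emb2 <- lsi_sketch(tile_mat[var_tiles, ], n_dims = 5)
dim(emb2)
```

The second embedding has one row per cell and one column per LSI dimension. The point of the second pass is that features chosen for their variability across preliminary clusters tend to carry more cell-type signal than features chosen for overall accessibility.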

21.2 Set environment and load Arrow project

## Section: set default parameters
##################################################
addArchRThreads(threads = 16) # setting default number of parallel threads


## Section: load object
##################################################
proj <- loadArchRProject(path = "data/ArchR/ArrowProject/Merged/")

21.3 LSI, clustering and embedding

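# ArchR functions return a modified copy of the project,
# so the result must be assigned back to `proj` to be kept
proj <- proj %>%
  addIterativeLSI(
    # first-round iterative LSI on the tile matrix with default parameters;
    # on large datasets the defaults subsample cells in early iterations (estimated LSI)
    ArchRProj = .,
    useMatrix = "TileMatrix",
    name = "IterativeLSI_tile",
    force = TRUE
  ) %>%
  addClusters(
    # cluster cells on the LSI dimensions using Seurat
    input = .,
    reducedDims = "IterativeLSI_tile",
    force = TRUE
  ) %>%
  addUMAP(
    # add a UMAP embedding computed from the LSI dimensions
    ArchRProj = .,
    reducedDims = "IterativeLSI_tile",
    name = "UMAP_tile",
    nNeighbors = 30,
    force = TRUE
  )
proj # print the updated project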
## Checking Inputs...
## ArchR logging to : ArchRLogs/ArchR-addIterativeLSI-131c165a8193-Date-2021-11-12_Time-14-54-13.log
## If there is an issue, please report to github with logFile!
## 2021-11-12 14:54:16 : Computing Total Across All Features, 0.003 mins elapsed.
## 2021-11-12 14:54:25 : Computing Top Features, 0.152 mins elapsed.
## ###########
## 2021-11-12 14:54:27 : Running LSI (1 of 2) on Top Features, 0.176 mins elapsed.
## ###########
## 2021-11-12 14:54:27 : Creating Partial Matrix, 0.177 mins elapsed.
## 2021-11-12 14:54:34 : Computing LSI, 0.301 mins elapsed.
## 2021-11-12 14:54:47 : Identifying Clusters, 0.511 mins elapsed.
## 2021-11-12 14:54:58 : Identified 6 Clusters, 0.703 mins elapsed.
## 2021-11-12 14:54:58 : Saving LSI Iteration, 0.703 mins elapsed.
## 2021-11-12 14:55:14 : Creating Cluster Matrix on the total Group Features, 0.97 mins elapsed.
## 2021-11-12 14:57:08 : Computing Variable Features, 2.858 mins elapsed.
## ###########
## 2021-11-12 14:57:08 : Running LSI (2 of 2) on Variable Features, 2.863 mins elapsed.
## ###########
## 2021-11-12 14:57:08 : Creating Partial Matrix, 2.863 mins elapsed.
## 2021-11-12 14:57:22 : Computing LSI, 3.093 mins elapsed.
## 2021-11-12 14:57:35 : Finished Running IterativeLSI, 3.322 mins elapsed.
## ArchR logging to : ArchRLogs/ArchR-addClusters-131c4451f2d7-Date-2021-11-12_Time-14-57-35.log
## If there is an issue, please report to github with logFile!
## Overriding previous entry for Clusters
## 2021-11-12 14:57:36 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.003 mins elapsed.
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 3000
## Number of edges: 135042
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.7866
## Number of communities: 10
## Elapsed time: 0 seconds
## 2021-11-12 14:57:43 : Testing Outlier Clusters, 0.111 mins elapsed.
## 2021-11-12 14:57:43 : Assigning Cluster Names to 10 Clusters, 0.111 mins elapsed.
## 2021-11-12 14:57:43 : Finished addClusters, 0.114 mins elapsed.
## 14:57:43 UMAP embedding parameters a = 0.7669 b = 1.223
## 14:57:43 Read 3000 rows and found 30 numeric columns
## 14:57:43 Using Annoy for neighbor search, n_neighbors = 30
## 14:57:43 Building Annoy index with metric = cosine, n_trees = 50
## 0%   10   20   30   40   50   60   70   80   90   100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 14:57:44 Writing NN index file to temp file /tmp/Rtmpo18Qqo/file131c58c87e2f
## 14:57:44 Searching Annoy index using 32 threads, search_k = 3000
## 14:57:44 Annoy recall = 100%
## 14:57:45 Commencing smooth kNN distance calibration using 32 threads
## 14:57:46 Initializing from normalized Laplacian + noise
## 14:57:47 Commencing optimization for 500 epochs, with 133184 positive edges
## 14:58:00 Optimization finished
## 14:58:00 Creating temp model dir /tmp/Rtmpo18Qqo/dir131c5c7bb39f
## 14:58:00 Creating dir /tmp/Rtmpo18Qqo/dir131c5c7bb39f
## 14:58:00 Changing to /tmp/Rtmpo18Qqo/dir131c5c7bb39f
## 14:58:00 Creating /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged/Embeddings/Save-Uwot-UMAP-Params-IterativeLSI_tile-131c5b231369-Date-2021-11-12_Time-14-58-00.tar
## 
##            ___      .______        ______  __    __  .______      
##           /   \     |   _  \      /      ||  |  |  | |   _  \     
##          /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
##         /  /_\  \   |      /     |  |     |   __   | |      /     
##        /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
##       /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|
## 
## class: ArchRProject 
## outputDirectory: /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged 
## samples(4): 68A 68B 84B 84C
## sampleColData names(1): ArrowFiles
## cellColData names(22): Sample TSSEnrichment ... ReadsInPeaks FRIP
## numberOfCells(1): 3000
## medianTSS(1): 8.733
## medianFrags(1): 10066.5
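
After clustering, it can be useful to check whether any cluster is dominated by a single sample, which may point to a batch effect rather than biology. The sketch below uses a hypothetical helper, `cluster_composition`, and a toy metadata table with invented values; only the column names `Sample` and `Clusters` and the sample names are taken from this project.

```r
# Cross-tabulate samples against clusters and return, for each cluster,
# the fraction of its cells contributed by each sample.
cluster_composition <- function(cell_data,
                                sample_col = "Sample",
                                cluster_col = "Clusters") {
  counts <- table(cell_data[[sample_col]], cell_data[[cluster_col]])
  # margin = 2 normalises each column (cluster) to sum to 1.
  prop.table(counts, margin = 2)
}

# Toy stand-in for the cell metadata (hypothetical values, not the real project).
toy <- data.frame(
  Sample = c("68A", "68A", "68B", "84B", "84C", "84C"),
  Clusters = c("C1", "C2", "C1", "C2", "C2", "C1")
)
comp <- cluster_composition(toy)
round(comp, 2)

# With the real project, assuming the clustering result is stored in `proj`:
# comp <- cluster_composition(as.data.frame(getCellColData(proj)))
```

A cluster whose column is close to 1 for one sample and near 0 for the others deserves a closer look before it is interpreted as a cell type.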

21.4 Save arrow project

saveArchRProject(ArchRProj = proj)
## class: ArchRProject 
## outputDirectory: /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged 
## samples(4): 68A 68B 84B 84C
## sampleColData names(1): ArrowFiles
## cellColData names(22): Sample TSSEnrichment ... ReadsInPeaks FRIP
## numberOfCells(1): 3000
## medianTSS(1): 8.733
## medianFrags(1): 10066.5