Chapter 12 First Round Clustering Based on Tile Matrix

library(ArchR)
library(magrittr)
library(tidyverse)
set.seed(1)

12.1 Description

  • To cluster cells, we use a two-round clustering strategy:
    • First round: based on the 500 bp tile matrix
    • Second round: based on the union peak set merged from per-cluster peaks
  • Both rounds use iterative LSI
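
As a rough illustration of what a single LSI pass does, the sketch below applies a TF-IDF transform followed by a truncated SVD to a toy binary tile-by-cell matrix in base R. The toy matrix, the 1e4 scaling factor, and the choice of 10 components are assumptions made for illustration. This is not ArchR's implementation, which additionally iterates: it clusters cells after the first pass and re-selects variable features before running LSI again.

```r
set.seed(1)

# Toy binary accessibility matrix: 200 tiles (rows) x 50 cells (columns).
# In the real analysis this role is played by the 500 bp TileMatrix.
mat <- matrix(rbinom(200 * 50, size = 1, prob = 0.1), nrow = 200, ncol = 50)

# Drop tiles and cells without any signal to avoid division by zero.
mat <- mat[rowSums(mat) > 0, colSums(mat) > 0, drop = FALSE]

# Term frequency: each cell's counts divided by that cell's total.
tf <- sweep(mat, 2, colSums(mat), "/")

# Inverse document frequency: tiles open in few cells get a larger weight.
idf <- ncol(mat) / rowSums(mat)

# Log-scaled TF-IDF; idf is recycled down the rows, so tile i is scaled by idf[i].
tfidf <- log1p(tf * idf * 1e4)

# Truncated SVD: keep the first 10 right singular vectors (one row per cell).
n_dims <- 10
sv <- svd(tfidf, nu = 0, nv = n_dims)

# Scale by the singular values to get the reduced dimensions (cells x n_dims).
lsi <- sv$v %*% diag(sv$d[seq_len(n_dims)])
dim(lsi)
```

The resulting `lsi` matrix has one row per cell and is the kind of low-dimensional representation that clustering and UMAP are then run on.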

12.2 Set environment and load arrow project

## Section: set default parameters
##################################################
addArchRThreads(threads = 16) # set the default number of parallel threads


## Section: load object
##################################################
proj <- loadArchRProject(path = "data/ArchR/ArrowProject/Merged/")

12.3 LSI, clustering and embedding

proj <- proj %>%
  addIterativeLSI(
    # first round: iterative LSI on the tile matrix with default parameters,
    # which runs ArchR's estimated LSI
    ArchRProj = .,
    useMatrix = "TileMatrix",
    name = "IterativeLSI_tile",
    force = TRUE
  ) %>%
  addClusters(
    # cluster cells on the LSI dimensions using Seurat
    input = .,
    reducedDims = "IterativeLSI_tile",
    force = TRUE
  ) %>%
  addUMAP(
    # add a UMAP embedding computed from the same LSI dimensions
    ArchRProj = .,
    reducedDims = "IterativeLSI_tile",
    name = "UMAP_tile",
    nNeighbors = 30,
    force = TRUE
  )
proj # print a summary of the updated project
## Checking Inputs...
## ArchR logging to : ArchRLogs/ArchR-addIterativeLSI-13b296b2709bf-Date-2021-07-26_Time-20-59-10.log
## If there is an issue, please report to github with logFile!
## 2021-07-26 20:59:13 : Computing Total Across All Features, 0.003 mins elapsed.
## 2021-07-26 20:59:19 : Computing Top Features, 0.116 mins elapsed.
## ###########
## 2021-07-26 20:59:21 : Running LSI (1 of 2) on Top Features, 0.138 mins elapsed.
## ###########
## 2021-07-26 20:59:21 : Creating Partial Matrix, 0.138 mins elapsed.
## 2021-07-26 20:59:27 : Computing LSI, 0.247 mins elapsed.
## 2021-07-26 20:59:40 : Identifying Clusters, 0.457 mins elapsed.
## 2021-07-26 20:59:52 : Identified 6 Clusters, 0.652 mins elapsed.
## 2021-07-26 20:59:52 : Saving LSI Iteration, 0.652 mins elapsed.
## 2021-07-26 21:00:05 : Creating Cluster Matrix on the total Group Features, 0.875 mins elapsed.
## 2021-07-26 21:00:14 : Computing Variable Features, 1.027 mins elapsed.
## ###########
## 2021-07-26 21:00:14 : Running LSI (2 of 2) on Variable Features, 1.032 mins elapsed.
## ###########
## 2021-07-26 21:00:14 : Creating Partial Matrix, 1.032 mins elapsed.
## 2021-07-26 21:00:23 : Computing LSI, 1.178 mins elapsed.
## 2021-07-26 21:00:34 : Finished Running IterativeLSI, 1.354 mins elapsed.
## ArchR logging to : ArchRLogs/ArchR-addClusters-13b2920592bdc-Date-2021-07-26_Time-21-00-34.log
## If there is an issue, please report to github with logFile!
## Overriding previous entry for Clusters
## 2021-07-26 21:00:34 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.003 mins elapsed.
## Computing nearest neighbor graph
## Computing SNN
## Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck
## 
## Number of nodes: 3000
## Number of edges: 135042
## 
## Running Louvain algorithm...
## Maximum modularity in 10 random starts: 0.7866
## Number of communities: 10
## Elapsed time: 0 seconds
## 2021-07-26 21:00:41 : Testing Outlier Clusters, 0.111 mins elapsed.
## 2021-07-26 21:00:41 : Assigning Cluster Names to 10 Clusters, 0.111 mins elapsed.
## 2021-07-26 21:00:41 : Finished addClusters, 0.113 mins elapsed.
## 21:00:41 UMAP embedding parameters a = 0.7669 b = 1.223
## 21:00:41 Read 3000 rows and found 30 numeric columns
## 21:00:41 Using Annoy for neighbor search, n_neighbors = 30
## 21:00:41 Building Annoy index with metric = cosine, n_trees = 50
## 0%   10   20   30   40   50   60   70   80   90   100%
## [----|----|----|----|----|----|----|----|----|----|
## **************************************************|
## 21:00:42 Writing NN index file to temp file /tmp/RtmpVwfRkp/file13b291712fa27
## 21:00:42 Searching Annoy index using 32 threads, search_k = 3000
## 21:00:42 Annoy recall = 100%
## 21:00:42 Commencing smooth kNN distance calibration using 32 threads
## 21:00:43 Initializing from normalized Laplacian + noise
## 21:00:43 Commencing optimization for 500 epochs, with 133184 positive edges
## 21:00:57 Optimization finished
## 21:00:57 Creating temp model dir /tmp/RtmpVwfRkp/dir13b29407d8257
## 21:00:57 Creating dir /tmp/RtmpVwfRkp/dir13b29407d8257
## 21:00:57 Changing to /tmp/RtmpVwfRkp/dir13b29407d8257
## 21:00:57 Creating /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged/Embeddings/Save-Uwot-UMAP-Params-IterativeLSI_tile-13b295dc5727e-Date-2021-07-26_Time-21-00-57.tar
## 
##            ___      .______        ______  __    __  .______      
##           /   \     |   _  \      /      ||  |  |  | |   _  \     
##          /  ^  \    |  |_)  |    |  ,----'|  |__|  | |  |_)  |    
##         /  /_\  \   |      /     |  |     |   __   | |      /     
##        /  _____  \  |  |\  \\___ |  `----.|  |  |  | |  |\  \\___.
##       /__/     \__\ | _| `._____| \______||__|  |__| | _| `._____|
## 
## class: ArchRProject 
## outputDirectory: /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged 
## samples(4): 68A 68B 84B 84C
## sampleColData names(1): ArrowFiles
## cellColData names(22): Sample TSSEnrichment ... ReadsInPeaks FRIP
## numberOfCells(1): 3000
## medianTSS(1): 8.733
## medianFrags(1): 10066.5
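
One quick sanity check after clustering is to see how the samples are distributed across clusters. The sketch below uses a hypothetical helper, `cluster_by_sample()`, written in base R and run on toy labels so that it works without the project loaded; the helper name and the toy vectors are assumptions for illustration, not part of ArchR.

```r
# Hypothetical helper: cross-tabulate cluster labels against sample labels.
cluster_by_sample <- function(clusters, samples) {
  stopifnot(length(clusters) == length(samples))
  counts <- table(cluster = clusters, sample = samples)
  # Row-wise proportions: the fraction of each cluster coming from each sample.
  props <- prop.table(counts, margin = 1)
  list(counts = counts, props = round(props, 3))
}

# Toy labels standing in for the per-cell cluster and sample annotations.
toy_clusters <- c("C1", "C1", "C1", "C2", "C2", "C3")
toy_samples  <- c("68A", "68A", "68B", "68B", "84B", "84C")

res <- cluster_by_sample(toy_clusters, toy_samples)
res$counts
res$props
```

With the project from this chapter, the same helper could be called as `cluster_by_sample(proj$Clusters, proj$Sample)`, assuming the cluster labels were stored under the default `Clusters` column.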

12.4 Save arrow project

saveArchRProject(ArchRProj = proj)
## class: ArchRProject 
## outputDirectory: /lustre/user/liclab/liuyt/github/SingleCell-MultiOmics_Process_Visualization/data/ArchR/ArrowProject/Merged 
## samples(4): 68A 68B 84B 84C
## sampleColData names(1): ArrowFiles
## cellColData names(22): Sample TSSEnrichment ... ReadsInPeaks FRIP
## numberOfCells(1): 3000
## medianTSS(1): 8.733
## medianFrags(1): 10066.5