Evaluating single-cell cluster stability using the Jaccard similarity index
0301 basic medicine
03 medical and health sciences
Base Sequence
Sequence Analysis, RNA
Exome Sequencing
Cluster Analysis
Single-Cell Analysis
Algorithms
DOI:
10.1093/bioinformatics/btaa956
Publication Date:
2020-11-02T20:13:55Z
AUTHORS (7)
ABSTRACT
Motivation
One major goal of single-cell RNA sequencing (scRNAseq) experiments is to identify novel cell types. With increasingly large scRNAseq datasets, unsupervised clustering methods can now produce detailed catalogues of transcriptionally distinct groups of cells in a sample. However, interpreting these clusters is challenging for both technical and biological reasons. Popular clustering algorithms are sensitive to parameter choices and can produce different clustering solutions after even small changes in, among others, the number of principal components used, the number of nearest neighbors (k) and the resolution parameter.
Results
Here, we present a set of tools to evaluate cluster stability by subsampling, which can guide parameter choice and aid in biological interpretation. The R package scclusteval and the accompanying Snakemake workflow implement all steps of the pipeline: subsampling the cells, repeating the clustering with Seurat, estimating cluster stability using the Jaccard similarity index and providing rich visualizations.
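The stability estimate described above can be illustrated with a minimal sketch. It is written in Python rather than R and does not use or reproduce the scclusteval API; the function names, the 0.8 subsampling fraction and the toy labels are all hypothetical. The sketch assumes that each original cluster is first restricted to the cells that survived subsampling and is then scored by its best Jaccard match among the clusters obtained on the subsample. The re-clustering step itself (Seurat in the paper) is not shown; its result is supplied by hand.

```python
import random
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two collections of cell IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def subsample_cells(cells, fraction=0.8, seed=0):
    """Draw a random subset of cells without replacement (fraction is an assumed value)."""
    rng = random.Random(seed)
    ordered = sorted(cells)
    return rng.sample(ordered, round(fraction * len(ordered)))


def group_by_cluster(labels):
    """Turn a {cell: cluster} mapping into {cluster: set of cells}."""
    clusters = defaultdict(set)
    for cell, cluster in labels.items():
        clusters[cluster].add(cell)
    return dict(clusters)


def max_jaccard_per_cluster(original_labels, subsample_labels):
    """For each original cluster, return its best Jaccard match in the subsample clustering.

    Assumption of this sketch: original clusters are restricted to the
    subsampled cells before the comparison.
    """
    kept = set(subsample_labels)
    original = group_by_cluster(
        {cell: lab for cell, lab in original_labels.items() if cell in kept}
    )
    resampled = group_by_cluster(subsample_labels)
    return {
        name: max((jaccard(cells, other) for other in resampled.values()), default=0.0)
        for name, cells in original.items()
    }


# Toy data: six cells in two original clusters.
original_labels = {"c1": "A", "c2": "A", "c3": "A", "c4": "B", "c5": "B", "c6": "B"}
# Hand-written stand-in for re-clustering a subsample in which c6 was dropped.
subsample_labels = {"c1": 0, "c2": 0, "c3": 1, "c4": 1, "c5": 1}

scores = max_jaccard_per_cluster(original_labels, subsample_labels)
print(scores)
```

In a full run, the subsampling and re-clustering would be repeated many times, and the distribution of these per-cluster scores across repetitions would indicate how stable each original cluster is.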
Availability and implementation
R package scclusteval: https://github.com/crazyhottommy/scclusteval
Snakemake workflow: https://github.com/crazyhottommy/pyflow_seuratv3_parameter
Tutorial: https://crazyhottommy.github.io/EvaluateSingleCellClustering/