HGC: fast hierarchical clustering for large-scale single-cell data
0301 basic medicine
Benchmarking
Genetic Heterogeneity
03 medical and health sciences
0206 medical engineering
Cluster Analysis
02 engineering and technology
Software
Algorithms
DOI: 10.1093/bioinformatics/btab420
Publication Date: 2021-06-04T11:44:02Z
AUTHORS (3)
ABSTRACT
Summary
Clustering is a key step in revealing heterogeneities in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters without the hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets due to high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool to address both problems. It combines the advantages of graph-based clustering and HC. On the shared nearest-neighbor graph of cells, HGC constructs the hierarchical tree with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and can scale to large datasets.
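To make the idea of building a hierarchy on a shared nearest-neighbor (SNN) graph concrete, here is a minimal sketch in plain Python. It is not HGC's algorithm and does not use the HGC package. The summary states that HGC builds the tree in linear time, whereas this illustration uses a deliberately naive agglomeration that rescans all cluster pairs at every step. The function names (`knn_sets`, `snn_graph`, `agglomerate`, `cut`), the Jaccard edge weighting, and the average-weight merge rule are all assumptions chosen for illustration.

```python
import math
from itertools import combinations


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors (self excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_graph(points, k):
    """Build SNN edge weights as the Jaccard overlap of two neighborhoods.

    Each neighborhood is the kNN set plus the point itself (an assumption
    of this sketch). Pairs with no shared neighbor get no edge.
    """
    nn = knn_sets(points, k)
    closed = [nn[i] | {i} for i in range(len(points))]
    weights = {}
    for i, j in combinations(range(len(points)), 2):
        shared = len(closed[i] & closed[j])
        if shared:
            weights[(i, j)] = shared / len(closed[i] | closed[j])
    return weights


def agglomerate(n, weights):
    """Naive bottom-up merging on the graph (illustrative, far from linear time).

    At each step, merge the two clusters with the highest average SNN weight
    between their members. Returns the merge history as
    (left_members, right_members, score) tuples, which encodes a dendrogram.
    """
    clusters = {i: frozenset([i]) for i in range(n)}
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            total = sum(
                weights.get((min(i, j), max(i, j)), 0.0)
                for i in clusters[a]
                for j in clusters[b]
            )
            score = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or score > best[0]:
                best = (score, a, b)
        score, a, b = best
        merges.append((clusters[a], clusters[b], score))
        clusters[a] = clusters[a] | clusters[b]
        del clusters[b]
    return merges


def cut(n, merges, n_clusters):
    """Replay the earliest merges to obtain a flat clustering with n_clusters groups."""
    groups = {frozenset([i]) for i in range(n)}
    for left, right, _ in merges[: n - n_clusters]:
        groups -= {left, right}
        groups.add(left | right)
    return groups
```

Because the full merge history is kept, one run supports several resolutions: calling `cut(n, merges, 2)` and `cut(n, merges, 4)` on the same `merges` list gives coarser and finer partitions of the same tree, which is the kind of multiresolution exploration the summary describes.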
Availability and implementation
The R package of HGC is available at https://bioconductor.org/packages/HGC/.
Supplementary information
Supplementary data are available at Bioinformatics online.