HGC: fast hierarchical clustering for large-scale single-cell data

Subject areas: basic medicine; medical and health sciences; medical engineering; engineering and technology
Keywords: Benchmarking; Genetic Heterogeneity; Cluster Analysis; Software; Algorithms
DOI: 10.1093/bioinformatics/btab420
Publication Date: 2021-06-04T11:44:02Z
ABSTRACT
Summary: Clustering is a key step in revealing heterogeneities in single-cell data. Most existing single-cell clustering methods output a fixed number of clusters without the hierarchical information. Classical hierarchical clustering (HC) provides dendrograms of cells, but cannot scale to large datasets due to high computational complexity. We present HGC, a fast Hierarchical Graph-based Clustering tool to address both problems. It combines the advantages of graph-based clustering and HC. On the shared nearest-neighbor graph of cells, HGC constructs the hierarchical tree with linear time complexity. Experiments showed that HGC enables multiresolution exploration of the biological hierarchy underlying the data, achieves state-of-the-art accuracy on benchmark data and can scale to large datasets.

Availability and implementation: The R package of HGC is available at https://bioconductor.org/packages/HGC/.

Supplementary information: Supplementary data are available at Bioinformatics online.
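
The idea of building a hierarchy on a shared nearest-neighbor (SNN) graph and then cutting it at several resolutions can be illustrated with a small toy sketch. This is not the HGC implementation and does not reproduce its linear-time algorithm: the agglomeration below is deliberately naive and far slower. All function names (`knn`, `snn_graph`, `merge_tree`, `cut`), the Jaccard edge weighting, the average-weight merge rule and the six toy "cells" are assumptions made for illustration only.

```python
import math
from itertools import combinations


def knn(points, k):
    """For each point, return the index set of its k nearest neighbours plus itself."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]) | {i})
    return neighbours


def snn_graph(points, k):
    """Weighted SNN graph: edge weight is the Jaccard overlap of two neighbour sets."""
    nbrs = knn(points, k)
    edges = {}
    for i, j in combinations(range(len(points)), 2):
        shared = len(nbrs[i] & nbrs[j])
        if shared:
            edges[(i, j)] = shared / len(nbrs[i] | nbrs[j])
    return edges


def merge_tree(n, edges):
    """Naive agglomeration (assumed rule): repeatedly merge the pair of clusters
    with the highest average SNN edge weight between them.
    Returns a list of (left_id, right_id, new_id) merges, i.e. a dendrogram."""
    clusters = {i: [i] for i in range(n)}
    merges = []
    next_id = n
    while len(clusters) > 1:
        best = None
        for a, b in combinations(sorted(clusters), 2):
            total = sum(edges.get((min(u, v), max(u, v)), 0.0)
                        for u in clusters[a] for v in clusters[b])
            score = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, next_id))
        next_id += 1
    return merges


def cut(n, merges, n_clusters):
    """Replay the first n - n_clusters merges to obtain a flat clustering."""
    clusters = {i: [i] for i in range(n)}
    for a, b, new in merges[: n - n_clusters]:
        clusters[new] = clusters.pop(a) + clusters.pop(b)
    return sorted(sorted(c) for c in clusters.values())


# Toy data: two well-separated groups of three "cells" in 2-D (hypothetical).
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
edges = snn_graph(cells, k=2)
merges = merge_tree(len(cells), edges)

print(cut(len(cells), merges, 2))  # [[0, 1, 2], [3, 4, 5]]
```

Because the whole merge history is kept, the same tree can be cut at any number of clusters without re-running the clustering, which is the multiresolution property the summary refers to. For real data, the HGC R package itself should be used.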