SolidBin: improving metagenome binning with semi-supervised normalized cut
DOI: 10.1093/bioinformatics/btz253
Publication Date: 2019-04-05T21:11:36Z
AUTHORS (5)
ABSTRACT
Motivation: Metagenomic contig binning is an important computational problem in metagenomic research, which aims to cluster contigs from the same genome into the same group. Unlike the classical clustering problem, contig binning can utilize known relationships among some of the contigs or the taxonomic identity of some contigs. However, current state-of-the-art contig binning methods do not make full use of additional biological information beyond the coverage and sequence composition of the contigs.

Results: We developed a novel contig binning method, Semi-supervised Spectral Normalized Cut for Binning (SolidBin), based on semi-supervised spectral clustering. Using sequence feature similarity and/or additional biological information, such as reliable taxonomy assignments of some contigs, SolidBin constructs two types of prior information: must-link and cannot-link constraints. A must-link constraint means that a pair of contigs should be clustered into the same group, while a cannot-link constraint means that a pair of contigs should be clustered into different groups. These constraints are then integrated into a classical spectral clustering approach, normalized cut, for improved contig binning. The performance of SolidBin is compared with five state-of-the-art genome binners, CONCOCT, COCACOLA, MaxBin, MetaBAT and BMC3C, on next-generation sequencing benchmark datasets including simulated multi-sample and single-sample datasets and real multi-sample datasets. The experimental results show that SolidBin achieved the best performance in terms of F-score, Adjusted Rand Index and Normalized Mutual Information, especially on the real datasets and the single-sample dataset.

Availability and implementation: https://github.com/sufforest/SolidBin.

Supplementary information: Supplementary data are available at Bioinformatics online.
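To make the must-link and cannot-link idea from the abstract concrete, the following is a minimal sketch in Python. It is not SolidBin's implementation: the function names (`build_constraints`, `apply_constraints`, `normalized_cut`), the toy affinity matrix, and the choice to simply overwrite affinities for constrained pairs are all assumptions made for illustration. The abstract does not specify how SolidBin integrates the constraints into normalized cut. The sketch only shows that, once constraints are folded into the affinities, a partition that respects them gets a lower normalized-cut value than one that violates them.

```python
from itertools import combinations


def build_constraints(labels):
    """Derive pairwise constraints from partial labels (hypothetical helper).

    labels maps a contig index to a taxon label, and covers only those
    contigs that have a reliable taxonomy assignment.
    """
    must_link, cannot_link = set(), set()
    for a, b in combinations(sorted(labels), 2):
        if labels[a] == labels[b]:
            must_link.add((a, b))      # same taxon: should share a bin
        else:
            cannot_link.add((a, b))    # different taxa: should be separated
    return must_link, cannot_link


def apply_constraints(W, must_link, cannot_link, strong=1.0):
    """Return a copy of the symmetric affinity matrix W with constraints applied.

    Assumption for this sketch: must-link pairs get affinity `strong`,
    cannot-link pairs get affinity 0.
    """
    W2 = [row[:] for row in W]
    for i, j in must_link:
        W2[i][j] = W2[j][i] = strong
    for i, j in cannot_link:
        W2[i][j] = W2[j][i] = 0.0
    return W2


def normalized_cut(W, groups):
    """Normalized cut of a partition: sum over groups of cut(A, rest) / assoc(A, all)."""
    n = len(W)
    total = 0.0
    for g in set(groups):
        members = [i for i in range(n) if groups[i] == g]
        assoc = sum(W[i][j] for i in members for j in range(n))
        cut = sum(W[i][j] for i in members for j in range(n) if groups[j] != g)
        if assoc > 0:
            total += cut / assoc
    return total


# Toy data: four contigs whose feature similarity alone is uninformative.
W = [
    [0.0, 0.5, 0.5, 0.5],
    [0.5, 0.0, 0.5, 0.5],
    [0.5, 0.5, 0.0, 0.5],
    [0.5, 0.5, 0.5, 0.0],
]
# Contigs 0 and 1 are assigned to taxon "A", contig 3 to taxon "B"; contig 2 is unlabeled.
labels = {0: "A", 1: "A", 3: "B"}

must_link, cannot_link = build_constraints(labels)
W_constrained = apply_constraints(W, must_link, cannot_link)

respecting = [0, 0, 1, 1]  # keeps 0 and 1 together, separates them from 3
violating = [0, 1, 1, 0]   # splits 0 from 1 and puts 0 with 3

ncut_respecting = normalized_cut(W_constrained, respecting)
ncut_violating = normalized_cut(W_constrained, violating)
print(ncut_respecting < ncut_violating)
```

On the unconstrained matrix `W` the two partitions score the same, so the constraints are what break the tie in this toy case. A full method would minimize the normalized cut over partitions, typically through a spectral relaxation, instead of comparing two hand-picked candidates.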