Swarm v3: towards tera-scale amplicon clustering

Keywords: amplicon clustering; operational taxonomic units; cluster analysis; DNA sequence; repeated sequence; sampling; analytical technique; taxonomy; software; open source; bioinformatics; data processing and computer science; genetics and plant breeding; biology
DOI: 10.1093/bioinformatics/btab493
Publication Date: 2021-07-01T12:30:15Z
ABSTRACT
Motivation: Previously we presented swarm, an open-source amplicon clustering programme that produces fine-scale molecular operational taxonomic units (OTUs) that are free of arbitrary global clustering thresholds. Here, we present swarm v3 to address the challenges posed by contemporary datasets, which are growing towards terabyte sizes.

Results: Compared with previous swarm versions, swarm v3 has modernized C++ source code, a memory footprint reduced by up to 50%, and optimized CPU usage and multithreading (more than 7 times faster with default parameters). It has also been extensively tested for robustness and logic.

Availability and implementation: Source code and binaries are available at https://github.com/torognes/swarm.

Supplementary information: Supplementary data are available at Bioinformatics online.