MetaEuk—sensitive, high-throughput gene discovery, and annotation for large-scale eukaryotic metagenomics

Keywords: Eukaryotes; Annotation; Microbial ecology; MetaEuk; Databases, Genetic; Contigs; Homology detection; Microbiota; Computational Biology; Eukaryota; Molecular Sequence Annotation; Sequence Analysis, DNA; High-Throughput Screening Assays; Metagenome; Metagenomics; Prediction; Algorithms
DOI: 10.1186/s40168-020-00808-x
Publication Date: 2020-04-03
ABSTRACT
Metagenomics is revolutionizing the study of microorganisms and their involvement in biological, biomedical, and geochemical processes, allowing us to investigate by direct sequencing a tremendous diversity of organisms without the need for prior cultivation. Unicellular eukaryotes play essential roles in most microbial communities as chief predators, decomposers, phototrophs, bacterial hosts, symbionts, and parasites to plants and animals. Investigating their roles is therefore of great interest to ecology, biotechnology, human health, and evolution. However, their generally lower sequencing coverage, their more complex gene and genome architectures, and a lack of eukaryote-specific experimental and computational procedures have kept them on the sidelines of metagenomics.

MetaEuk is a toolkit for high-throughput, reference-based discovery and annotation of protein-coding genes in eukaryotic metagenomic contigs. It performs fast searches with 6-frame-translated fragments covering all possible exons and optimally combines matches into multi-exon proteins. We used a benchmark of seven diverse, annotated genomes to show that MetaEuk is highly sensitive even under conditions of low sequence similarity to the reference database. To demonstrate MetaEuk's power to discover novel eukaryotic proteins in large-scale metagenomic data, we assembled contigs from 912 samples of the Tara Oceans project. MetaEuk predicted >12,000,000 protein-coding genes in 8 days on ten 16-core servers. Most of the discovered proteins are highly diverged from known proteins and originate from very sparsely sampled eukaryotic supergroups.

The open-source (GPLv3) MetaEuk software (https://github.com/soedinglab/metaeuk) enables large-scale eukaryotic metagenomics through reference-based, sensitive taxonomic and functional annotation.
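
The first step described above, translating a contig in all six reading frames and cutting the translations into stop-free fragments, can be illustrated with a short Python sketch. This is not MetaEuk's code: it assumes the standard genetic code, the function names and the `min_len` cutoff are hypothetical, and it does not cover the search against a reference database or the step that combines matches into multi-exon proteins.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative genetic codes are considered in this sketch).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; codons with unknown bases become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_fragments(dna, min_len=2):
    """Hypothetical helper: list (strand, offset, peptide) for every
    stop-free stretch of at least min_len residues in the six frames."""
    dna = dna.upper()
    fragments = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            protein = translate(seq[offset:])
            # A stop codon ('*') ends a putative coding fragment.
            for piece in protein.split("*"):
                if len(piece) >= min_len:
                    fragments.append((strand, offset, piece))
    return fragments


if __name__ == "__main__":
    for strand, offset, peptide in six_frame_fragments("ATGGCCTAAGGG"):
        print(strand, offset, peptide)
```

Each fragment is a candidate exon translation. On real contigs, a much larger minimum length would be needed to keep the number of candidates manageable.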