NFDI4DS | UHH-SEMS - Publication Details

BigFiRSt: A Software Program Using Big Data Technique for Mining Simple Sequence Repeats From Large-Scale Sequencing Data

DOI: 10.3389/fdata.2021.727216
Publication Date: 2022-01-18T11:31:53Z

AUTHORS (10)

Jinxiang Chen

Fuyi Li

Miao Wang

Junlong Li

Tatiana T. Marque...

André Leier

Jerico Revote

Shuqin Li

Quanzhong Liu

Jiangning Song

ABSTRACT

Simple Sequence Repeats (SSRs) are short tandem repeats of nucleotide sequences. It has been shown that SSRs are associated with human diseases and are of medical relevance. Accordingly, a variety of computational methods have been proposed to mine SSRs from genomes. Conventional methods rely on a high-quality complete genome to identify SSRs. However, the sequenced genome often misses several highly repetitive regions. Moreover, many non-model species have no entire genome. With recent advances in next-generation sequencing (NGS) techniques, large-scale sequence reads for any species can be rapidly generated using NGS. In this context, a number of methods have been proposed to identify thousands of SSR loci within large amounts of reads for non-model species. While the most commonly used NGS platforms on the market (e.g., the Illumina platform) generally provide paired-end reads, merging overlapping paired-end reads has become a common step prior to the identification of SSR loci. This has posed a big data analysis challenge for traditional stand-alone tools that merge read pairs and identify SSRs from large-scale data.

In this study, we present a new Hadoop-based software program, termed BigFiRSt, to address this problem using cutting-edge big data technology. BigFiRSt consists of two major modules, BigFLASH and BigPERF, implemented based on two state-of-the-art stand-alone tools, FLASH and PERF, respectively. BigFLASH and BigPERF address the problems of merging read pairs and mining SSRs in a big data manner, respectively. Comprehensive benchmarking experiments show that BigFiRSt can dramatically reduce the execution times of read pair merging and SSR mining on very large-scale DNA sequence data.

The excellent performance of BigFiRSt is mainly attributable to Big Data Hadoop technology, which merges read pairs and mines SSRs in parallel on distributed computing clusters. We anticipate that BigFiRSt will be a valuable tool in the coming biological Big Data era.
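To make the two steps named in the abstract concrete, the following is a minimal single-machine sketch in Python of merging one overlapping read pair and then scanning the merged sequence for SSRs. It is an illustration under stated assumptions, not the BigFiRSt, FLASH, or PERF implementation: the function names `reverse_complement`, `merge_pair`, and `find_ssrs`, their parameters, and the example sequence are all hypothetical. The merge accepts only exact overlaps, whereas a real read merger would tolerate sequencing errors, and nothing here is distributed over a Hadoop cluster.

```python
import re

# All names below are hypothetical and chosen for illustration only;
# they are not taken from BigFiRSt, FLASH, or PERF.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=5):
    """Merge a paired-end read pair on its longest exact overlap.

    read2 is assumed to come from the opposite strand, so it is
    reverse-complemented first. Returns the merged sequence, or None
    when no exact overlap of at least min_overlap bases exists.
    """
    r2 = reverse_complement(read2)
    for k in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        if read1[-k:] == r2[:k]:
            return read1 + r2[k:]
    return None


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, end, unit, repeats) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest repeat unit
    first at each position; end is exclusive.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), m.end(), unit, len(m.group(0)) // len(unit)))
    return hits


# A made-up 23-base fragment sequenced from both ends with a 7-base overlap.
fragment = "GGTACACACACTTATGATGATGC"
read1 = fragment[:15]
read2 = reverse_complement(fragment[8:])

merged = merge_pair(read1, read2)
ssrs = find_ssrs(merged)
```

In this toy case `merged` equals the original fragment, and `ssrs` holds one dinucleotide repeat (`AC` x 4) and one trinucleotide repeat (`ATG` x 3). Both functions process each read pair independently of every other pair, which is the property that lets this kind of workload be split across the nodes of a cluster.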

REFERENCES (117)

CITATIONS (3)
