Discovery of tandem and interspersed segmental duplications using high-throughput sequencing

Segmental duplication Sequence (biology)
DOI: 10.1093/bioinformatics/btz237 Publication Date: 2019-03-29T12:10:32Z
ABSTRACT
Motivation: Several algorithms have been developed that use high-throughput sequencing technology to characterize structural variations (SVs). Most of the existing approaches focus on detecting relatively simple types of SVs such as insertions, deletions and short inversions. In fact, complex SVs are of crucial importance, and several have been associated with genomic disorders. To better understand the contribution of complex SVs to human disease, we need new algorithms to accurately discover and genotype such variants. Additionally, due to similar sequencing signatures, inverted duplications or gene conversion events that include inverted segmental duplications are often characterized as simple inversions; likewise, duplications and gene conversions in direct orientation may be called as simple deletions. Therefore, there is still a need for accurate algorithms to fully characterize complex SVs and thus improve the calling accuracy of more simple variants.

Results: We developed novel algorithms to accurately characterize tandem, direct and inverted interspersed segmental duplications using short read whole genome sequencing datasets. We integrated these methods into our TARDIS tool, which is now capable of detecting various types of SVs using multiple sequence signatures such as read pair, read depth and split read. We evaluated the prediction performance of our algorithms through several experiments using both simulated and real datasets. In the simulation experiments, using 30× coverage, TARDIS achieved 96% sensitivity with only 4% false discovery rate. For experiments that involve real data, we used two haploid genomes (CHM1 and CHM13) and one human genome (NA12878) from the Illumina Platinum Genomes set. Comparison of our results with orthogonal PacBio call sets from the same genomes revealed higher accuracy for TARDIS than state-of-the-art methods. Furthermore, we showed a surprisingly low false discovery rate for our approach on CHM1 (<5% for the top 50 predictions).

Availability and implementation: TARDIS source code is available at https://github.com/BilkentCompGen/tardis, and a corresponding Docker image is available at https://hub.docker.com/r/alkanlab/tardis/.

Supplementary information: Supplementary data are available at Bioinformatics online.
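
The sensitivity and false discovery rate figures quoted in the Results follow the standard definitions for evaluating a call set against a truth set. A minimal sketch of that arithmetic is shown here; the function names and the counts are hypothetical, chosen only so that the output matches the 96% and 4% rates reported for the simulation, and they are not taken from the paper's data or from TARDIS itself.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true events that were called: TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no true events in the truth set")
    return tp / (tp + fn)


def false_discovery_rate(tp: int, fp: int) -> float:
    """Fraction of calls that are wrong: FP / (TP + FP)."""
    if tp + fp == 0:
        raise ValueError("no calls were made")
    return fp / (tp + fp)


# Hypothetical counts (an assumption for illustration, not the paper's data):
# 100 simulated duplications, 96 recovered, 4 missed, 4 spurious calls.
tp, fn, fp = 96, 4, 4

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"false discovery rate = {false_discovery_rate(tp, fp):.2f}")
```

Note that the two rates use different denominators: sensitivity is measured against the truth set, while the false discovery rate is measured against the set of calls, so they do not need to sum to one.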