ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers

Sequence assembly
DOI: 10.1186/s12859-018-2243-x Publication Date: 2018-06-20T12:31:34Z
ABSTRACT
The long-range sequencing information captured by linked reads, such as those available from 10× Genomics (10xG), helps resolve genome sequence repeats, and yields accurate and contiguous draft genome assemblies. We introduce ARKS, an alignment-free linked read genome scaffolding methodology that uses linked reads to organize genome assemblies further into contiguous drafts. Our approach departs from other read alignment-dependent linked read scaffolders, including our own (ARCS), and uses a kmer-based mapping approach. The kmer mapping strategy has several advantages over read alignment methods, including better usability and faster processing, as it precludes the need for input sequence formatting and draft sequence assembly indexing. The reliance on kmers instead of read alignments for pairing sequences relaxes the workflow requirements, and drastically reduces the run time.

Here, we show how linked reads, when used in conjunction with Hi-C data for scaffolding, improve a draft human genome assembly of PacBio long-read data five-fold (baseline vs. ARKS NG50 = 4.6 vs. 23.1 Mbp, respectively). We also demonstrate how the method provides further improvements of a megabase-scale Supernova human genome assembly (NG50 = 14.74 Mbp vs. 25.94 Mbp before and after ARKS), which itself exclusively uses linked read data for assembly, with an execution speed six to nine times faster than competitive linked read scaffolders (~10.5 h compared to 75.7 h, on average). Following ARKS scaffolding of a human genome 10xG Supernova assembly (of cell line NA12878), fewer than 9 scaffolds cover each chromosome, except the largest (chromosome 1, n = 13).

ARKS uses a kmer mapping strategy instead of linked read alignments to record and associate the barcode information needed to order and orient draft assembly sequences. The simplified workflow, when compared to that of our initial implementation, ARCS, markedly improves run time performances on experimental human genome datasets. Furthermore, the novel distance estimator in ARKS utilizes barcoding information from linked reads to estimate gap sizes. It accomplishes this by modeling the relationship between known distances of a region within contigs and calculating associated Jaccard indices. ARKS has the potential to provide correct, chromosome-scale genome assemblies, promptly. We expect ARKS to have broad utility in helping refine draft genomes.
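The two ideas named in the abstract, kmer-based assignment of barcoded reads to draft sequences and a Jaccard index over the resulting barcode sets, can be illustrated with a small sketch. This is not the ARKS implementation: the function names, the majority-vote rule, the `min_fraction` threshold, and the toy sequences are all assumptions made for illustration, and details a real scaffolder would need (canonical kmers, contig-end regions, barcode multiplicity filters, the fitted distance model) are left out.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Yield every substring of length k in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_kmer_index(contigs, k):
    """Map each kmer to the one contig that contains it.

    Kmers found in more than one contig are discarded as ambiguous
    (an illustrative simplification).
    """
    index = {}
    ambiguous = set()
    for name, seq in contigs.items():
        for km in set(kmers(seq, k)):
            if km in ambiguous:
                continue
            if km in index and index[km] != name:
                del index[km]
                ambiguous.add(km)
            else:
                index[km] = name
    return index


def assign_read(read, index, k, min_fraction=0.5):
    """Return the contig hit by most of the read's kmers, or None.

    min_fraction is a hypothetical threshold on the share of the
    read's kmers that must point to the winning contig.
    """
    total = len(read) - k + 1
    hits = Counter(index[km] for km in kmers(read, k) if km in index)
    if total <= 0 or not hits:
        return None
    contig, count = hits.most_common(1)[0]
    return contig if count / total >= min_fraction else None


def barcodes_per_contig(reads, index, k):
    """Collect the set of barcodes whose reads map to each contig."""
    result = defaultdict(set)
    for barcode, read in reads:
        contig = assign_read(read, index, k)
        if contig is not None:
            result[contig].add(barcode)
    return result


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


if __name__ == "__main__":
    # Toy data: two short "contigs" and four barcoded reads.
    contigs = {"c1": "ACGTACGGTCA", "c2": "TTGACCATGCA"}
    reads = [
        ("bx1", "ACGTACGG"),
        ("bx1", "GACCATGC"),
        ("bx2", "GGTCA"),
        ("bx3", "CATGCA"),
    ]
    index = build_kmer_index(contigs, k=4)
    barcodes = barcodes_per_contig(reads, index, k=4)
    # One barcode (bx1) is shared out of three seen across both contigs.
    print(jaccard(barcodes["c1"], barcodes["c2"]))
```

In this toy setting, a higher Jaccard index between the barcode sets of two sequences suggests that they lie closer together on the same molecule. The abstract describes calibrating that relationship on regions of known distance within contigs; the sketch stops short of fitting such a model.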