High-quality draft assemblies of mammalian genomes from massively parallel sequence data

Hybrid genome assembly; Massive parallel sequencing; Sequence assembly; Sequence (biology)
DOI: 10.1073/pnas.1017351108 Publication Date: 2010-12-28T01:45:14Z
ABSTRACT
Massively parallel DNA sequencing technologies are revolutionizing genomics by making it possible to generate billions of relatively short (~100-base) sequence reads at very low cost. Whereas such data can be readily used for a wide range of biomedical applications, it has proven difficult to use them to generate high-quality de novo genome assemblies of large, repeat-rich vertebrate genomes. To date, the genome assemblies generated from such data have fallen far short of those obtained with the older (but much more expensive) capillary-based sequencing approach. Here, we report the development of an algorithm for genome assembly, ALLPATHS-LG, and its application to massively parallel DNA sequence data from the human and mouse genomes, generated on the Illumina platform. The resulting draft genome assemblies have good accuracy, short-range contiguity, long-range connectivity, and coverage of the genome. In particular, the base accuracy is high (≥99.95%) and the scaffold sizes (N50 size = 11.5 Mb for human and 7.2 Mb for mouse) approach those obtained with capillary-based sequencing. The combination of improved sequencing technology and improved computational methods should now make it possible to increase dramatically the de novo sequencing of large genomes. The ALLPATHS-LG program is available at http://www.broadinstitute.org/science/programs/genome-biology/crd.
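The abstract reports scaffold contiguity as an N50 size. As a minimal sketch of how that statistic is conventionally computed (this is not code from ALLPATHS-LG), N50 is the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. The function name `n50` and the toy lengths below are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    N50 is the largest length L such that sequences of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases until
    # at least half of the assembly has been covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), not data from the paper.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))
```

Comparing `2 * running` against `total` keeps the arithmetic in integers, which avoids rounding problems when the total number of bases is odd.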
REFERENCES (25)
CITATIONS (1383)