Systematic prediction and validation of breakpoints associated with copy-number variants in the human genome

Breakpoint; Comparative genomic hybridization; Structural variation; Segmental duplication; Sequence (biology)
DOI: 10.1073/pnas.0703834104 Publication Date: 2007-06-06T02:10:01Z
ABSTRACT
Copy-number variants (CNVs) are an abundant form of genetic variation in humans. However, approaches for determining exact CNV breakpoint sequences (physical deletion or duplication boundaries) across individuals, crucial for associating genotype to phenotype, have been lacking so far, and the vast majority of CNVs have been reported with approximate genomic coordinates only. Here, we report an approach, called BreakPtr, for fine-mapping CNVs (available from http://breakptr.gersteinlab.org). We statistically integrate both sequence characteristics and data from high-resolution comparative genome hybridization experiments in a discrete-valued, bivariate hidden Markov model. Incorporation of nucleotide-sequence information allows us to take into account the fact that recently duplicated sequences (e.g., segmental duplications) often coincide with breakpoints. In anticipation of an upcoming increase in CNV data, we developed an iterative, "active" approach of initially scoring with a preliminary model, performing targeted validations, retraining the model, and then rescoring, along with a flexible parameterization system that intuitively collapses from a full model of 2,503 parameters to a core one of only 10. Using our approach, we accurately mapped >400 breakpoints on chromosome 22 and a region of chromosome 11, refining the boundaries of many previously approximately mapped CNVs. Four predicted breakpoints flanked known disease-associated deletions. We validated an additional four predicted CNV breakpoints by sequencing. Overall, our results suggest a predictive resolution of 300 bp. This level of resolution enables more precise correlations between CNVs and across individuals than previously possible, allowing the study of CNV population frequencies. Further, it enabled us to demonstrate a clear Mendelian pattern of inheritance.
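To illustrate how a discrete-valued, bivariate hidden Markov model can place breakpoints, here is a minimal sketch. It is not BreakPtr's implementation. It assumes three hidden states, a discretized array-CGH ratio (`low`, `mid`, `high`) and a binary flag for overlap with a segmental duplication as the two observed variables per probe, conditionally independent emissions, and hand-picked probabilities. All names and numbers are hypothetical.

```python
import math

STATES = ("normal", "deletion", "duplication")

# Hypothetical parameters, chosen for illustration only (not from the paper).
START = {"normal": 0.9, "deletion": 0.05, "duplication": 0.05}
STAY = 0.9  # probability of remaining in the same state between probes
RATIO_EMIT = {  # P(discretized hybridization ratio | state)
    "normal": {"low": 0.1, "mid": 0.8, "high": 0.1},
    "deletion": {"low": 0.8, "mid": 0.15, "high": 0.05},
    "duplication": {"low": 0.05, "mid": 0.15, "high": 0.8},
}
SEQ_EMIT = {  # P(probe overlaps a segmental duplication | state)
    "normal": 0.1,
    "deletion": 0.3,
    "duplication": 0.3,
}


def trans(a, b):
    """Transition probability between two states (symmetric toy model)."""
    return STAY if a == b else (1 - STAY) / (len(STATES) - 1)


def emit(state, obs):
    """Joint emission of the bivariate observation, assuming independence."""
    ratio, in_segdup = obs
    p_seq = SEQ_EMIT[state] if in_segdup else 1 - SEQ_EMIT[state]
    return RATIO_EMIT[state][ratio] * p_seq


def viterbi(observations):
    """Return the most probable state path, computed in log space."""
    score = {
        s: math.log(START[s]) + math.log(emit(s, observations[0]))
        for s in STATES
    }
    back = []
    for obs in observations[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: score[p] + math.log(trans(p, s)))
            new_score[s] = (
                score[prev] + math.log(trans(prev, s)) + math.log(emit(s, obs))
            )
            pointers[s] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def breakpoints(path):
    """Probe indices at which the decoded state changes."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Toy probe series: normal, then a run of low ratios, then normal again.
observations = (
    [("mid", False)] * 5
    + [("low", True)]
    + [("low", False)] * 4
    + [("mid", True)]
    + [("mid", False)] * 5
)
path = viterbi(observations)
print(breakpoints(path))  # [5, 10]
```

With these toy numbers the decoded path switches to `deletion` at probe 5 and back to `normal` at probe 10. A real model would learn its parameters from validated breakpoints rather than fix them by hand.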
REFERENCES (34)
CITATIONS (67)