Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations

Imputation (statistics); Minor allele frequency; Genome-wide Association Study; 1000 Genomes Project; Linkage Disequilibrium; Genetic Association; International HapMap Project; Genetic architecture
DOI: 10.1371/journal.pgen.1008500 Publication Date: 2019-12-23T13:37:01Z
ABSTRACT
Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are limited. In addition, these populations have more complex linkage disequilibrium structure. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with genome-wide genotyping array data. We demonstrated that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhanced gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11–34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels. Impressively, even for extremely rare variants with minor allele count <10 (including singletons) in the imputation target samples, the average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~21,600 African-ancestry and ~21,700 Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC [p = 8.8 × 10⁻¹⁵] in African populations, and rs11549407 with lower HGB [p = 1.5 × 10⁻¹²] and HCT [p = 8.8 × 10⁻¹⁰] in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either the 1000 Genomes Project Phase 3 or the Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for the identification of novel rare variant associations not previously detected in similarly sized genome-wide studies of under-represented African and Hispanic/Latino populations.
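The two panel-comparison figures quoted in the abstract (fold increase in well-imputed variants and percent improvement in average imputation quality) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not give the cutoff for "well-imputed", so the `threshold=0.3` default is hypothetical; the per-variant quality scores are toy numbers, not study data; the improvement is computed as a relative change in the mean over all supplied variants, which may differ from the authors' definition; and the function names are invented.

```python
from statistics import mean


def summarize_panel(rsq_values, threshold=0.3):
    """Return (count of well-imputed variants, mean quality score).

    `threshold` is a hypothetical cutoff for "well-imputed";
    the abstract does not state the value the authors used.
    """
    n_well = sum(1 for r in rsq_values if r >= threshold)
    return n_well, mean(rsq_values)


def compare_panels(rsq_new, rsq_old, threshold=0.3):
    """Compare two reference panels on the same set of rare variants.

    Returns (fold increase in well-imputed variants,
             percent relative improvement in mean quality).
    """
    n_new, avg_new = summarize_panel(rsq_new, threshold)
    n_old, avg_old = summarize_panel(rsq_old, threshold)
    if n_old == 0 or avg_old == 0:
        raise ValueError("baseline panel has no usable variants to compare against")
    fold = n_new / n_old
    pct_improvement = (avg_new - avg_old) / avg_old * 100
    return fold, pct_improvement


# Toy per-variant quality scores (illustrative only, not from the study).
rsq_panel_a = [0.9, 0.8, 0.7, 0.6, 0.2]
rsq_panel_b = [0.5, 0.4, 0.2, 0.1, 0.1]

fold, pct = compare_panels(rsq_panel_a, rsq_panel_b)
print(f"fold increase in well-imputed variants: {fold:.1f}")
print(f"relative improvement in mean quality: {pct:.1f}%")
```

In practice the scores would come from the imputation software's per-variant output for each reference panel, restricted to the MAF bin of interest (for example MAF < 0.5%).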