Estimating Effect Sizes and Expected Replication Probabilities from GWAS Summary Statistics

Keywords: Genome-wide association study; Linkage disequilibrium; SNP; Missing heritability problem; Genetic association; Replication
DOI: 10.3389/fgene.2016.00015 Publication Date: 2016-02-16T06:23:18Z
ABSTRACT
Genome-wide Association Studies (GWAS) result in millions of summary statistics ("z-scores") for single nucleotide polymorphism (SNP) associations with phenotypes. These rich datasets afford deep insights into the nature and extent of genetic contributions to complex phenotypes such as psychiatric disorders, which are understood to have substantial genetic components that arise from very large numbers of SNPs. The complexity of the datasets, however, poses a significant challenge to maximizing their utility. This is reflected in a need for better understanding of the landscape of z-scores, as such knowledge would enhance causal SNP and gene discovery, help elucidate mechanistic pathways, and inform future study design. Here we present a parsimonious methodology for modeling effect sizes and replication probabilities, relying only on summary statistics from GWAS substudies, and a scheme allowing for direct empirical validation. We show that modeling z-scores as a mixture of Gaussians is conceptually appropriate, in particular taking into account ubiquitous non-null effects that are likely in the datasets due to weak linkage disequilibrium with causal SNPs. The four-parameter model allows for estimating the degree of polygenicity of the phenotype and predicting the proportion of chip heritability explainable by genome-wide significant SNPs in future studies with larger sample sizes. We apply the model to recent GWAS of schizophrenia (N = 82,315) and putamen volume (N = 12,596), with approximately 9.3 million SNP z-scores in both cases. We show that, over a broad range of z-scores and sample sizes, the model accurately predicts expectation estimates of true effect sizes and replication probabilities in multistage GWAS designs. We assess the degree to which effect sizes are over-estimated when based on linear-regression association coefficients. We estimate the polygenicity of schizophrenia to be 0.037 and that of putamen volume to be 0.001, while the respective sample sizes required to approach fully explaining the chip heritability are 10^6 and 10^5. The model can be extended to incorporate prior knowledge such as pleiotropy and SNP annotation. The current findings suggest that the model is applicable to a broad array of complex phenotypes and will enhance understanding of their genetic architectures.
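The mixture-of-Gaussians idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' four-parameter model: it assumes a two-parameter mixture in which a fraction pi1 of SNPs have true z-scale effects drawn from N(0, sigma2), the rest have zero effect, and estimation noise is N(0, 1). The function names, the value of sigma2, and the equal-size independent replication sample are assumptions made for illustration only.

```python
import math


def normal_pdf(x, var):
    """Density at x of a zero-mean normal with variance var."""
    return math.exp(-0.5 * x * x / var) / math.sqrt(2.0 * math.pi * var)


def normal_sf(x, mean, var):
    """P(X > x) for X ~ N(mean, var)."""
    return 0.5 * math.erfc((x - mean) / math.sqrt(2.0 * var))


def posterior_nonnull(z, pi1, sigma2):
    """P(SNP is non-null | observed z) under the two-component mixture."""
    num = pi1 * normal_pdf(z, 1.0 + sigma2)          # non-null: N(0, 1 + sigma2)
    den = num + (1.0 - pi1) * normal_pdf(z, 1.0)     # null: N(0, 1)
    return num / den


def expected_effect(z, pi1, sigma2):
    """Posterior mean of the true effect given z (shrinks z toward zero)."""
    shrink = sigma2 / (1.0 + sigma2)
    return posterior_nonnull(z, pi1, sigma2) * shrink * z


def replication_probability(z, pi1, sigma2, threshold):
    """P(replication z-score has the same sign as z and magnitude > threshold).

    Assumes an independent replication sample of the same size (hypothetical).
    """
    p1 = posterior_nonnull(z, pi1, sigma2)
    shrink = sigma2 / (1.0 + sigma2)
    a = abs(z)
    # Non-null: true effect | z ~ N(shrink * a, shrink); add unit noise variance.
    rep_nonnull = normal_sf(threshold, shrink * a, 1.0 + shrink)
    # Null: the replication z-score is pure noise.
    rep_null = normal_sf(threshold, 0.0, 1.0)
    return p1 * rep_nonnull + (1.0 - p1) * rep_null


if __name__ == "__main__":
    pi1 = 0.037    # echoes the schizophrenia polygenicity estimate quoted above
    sigma2 = 4.0   # made-up effect variance, for illustration only
    for z in (2.0, 4.0, 6.0):
        print(z, expected_effect(z, pi1, sigma2),
              replication_probability(z, pi1, sigma2, 1.96))
```

Even in this toy version, the expected true effect is always smaller in magnitude than the observed z-score, which is the over-estimation the abstract refers to, and the replication probability rises with the discovery z-score.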