Accounting for long-range correlations in genome-wide simulations of large cohorts

Coalescent theory; Identity by descent; Linkage disequilibrium; Demographic history
DOI: 10.1371/journal.pgen.1008619
Publication Date: 2020-05-05T17:46:30Z
ABSTRACT
Coalescent simulations are widely used to examine the effects of evolution and demographic history on the genetic makeup of populations. Thanks to recent progress in algorithms and data structures, simulators such as the widely-used msprime now provide genome-wide coalescent simulations for millions of individuals. However, this software relies on classic coalescent theory and its assumptions that sample sizes are small and that the region being simulated is short. Here we show that coalescent simulations of long regions of the genome exhibit large biases in identity-by-descent (IBD), long-range linkage disequilibrium (LD), and ancestry patterns, particularly when the sample size is large. We present a Wright-Fisher extension to msprime, and show that it produces more realistic distributions of IBD, LD, and ancestry proportions, while also addressing more subtle biases of the coalescent. Further, these extensions are more computationally efficient than state-of-the-art coalescent simulations when simulating long regions, including whole-genome data. For shorter regions, efficiency can be maintained via a hybrid model which simulates the recent past under the Wright-Fisher model and uses coalescent simulations in the distant past.
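
The hybrid idea described in the abstract can be illustrated with a toy sketch. The code below is not msprime's implementation and does not use its API. It assumes a single non-recombining locus in a diploid population of constant size, and all function names (`dtwf_step`, `coalescent_time_to_mrca`, `hybrid_tmrca`) and parameters are hypothetical, chosen for illustration only. It traces lineages backward generation by generation under a discrete-time Wright-Fisher model for a fixed number of generations, then switches to standard coalescent waiting times for the distant past.

```python
import random


def dtwf_step(lineages, pop_size, rng):
    """One generation backward under a discrete-time Wright-Fisher model.

    Each lineage picks a parent uniformly among the 2N genomes of a diploid
    population of size N. Lineages that pick the same parent merge, so
    several mergers (or mergers of more than two lineages) can occur in a
    single generation. Returns the number of lineages that remain.
    """
    parents = {rng.randrange(2 * pop_size) for _ in range(lineages)}
    return len(parents)


def coalescent_time_to_mrca(lineages, pop_size, rng):
    """Time (in generations) to the common ancestor under the coalescent.

    With k lineages, the waiting time to the next merger is exponential
    with rate k(k-1)/(4N), and exactly one pair merges at each event.
    """
    total = 0.0
    k = lineages
    while k > 1:
        rate = k * (k - 1) / (4.0 * pop_size)
        total += rng.expovariate(rate)
        k -= 1
    return total


def hybrid_tmrca(sample_size, pop_size, switch_time, rng):
    """Hybrid scheme: Wright-Fisher for the recent past, coalescent after.

    Runs dtwf_step for up to switch_time generations, then finishes the
    genealogy with coalescent waiting times for the remaining lineages.
    """
    k = sample_size
    generation = 0
    while k > 1 and generation < switch_time:
        k = dtwf_step(k, pop_size, rng)
        generation += 1
    return generation + coalescent_time_to_mrca(k, pop_size, rng)


# Illustrative run: a sample as large as the population loses many lineages
# in one Wright-Fisher generation, whereas the coalescent merges only one
# pair per event.
rng = random.Random(42)
remaining = dtwf_step(1000, 1000, rng)
print("lineages left after one generation:", remaining)
print("hybrid TMRCA (generations):", hybrid_tmrca(1000, 1000, 100, rng))
```

In this toy setting the first function shows why the small-sample assumption matters: when the sample size approaches the population size, many lineages merge within a single generation, which the pairwise-merger coalescent cannot represent. The long-range IBD and LD biases discussed in the abstract involve recombination along the genome and are not captured by this single-locus sketch.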