Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations

Sophistication Benchmarking
DOI: 10.7554/elife.84874.2 Publication Date: 2023-05-23T19:13:46Z
ABSTRACT
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic data sets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and in the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes of species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than three-fold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. These improvements to stdpopsim aim to further promote the use of realistic whole-genome simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.