- Genetic diversity and population structure
- Evolution and Genetic Dynamics
- Genetic Mapping and Diversity in Plants and Animals
- Genetic and phenotypic traits in livestock
- Genomics and Phylogenetic Studies
- Wildlife Ecology and Conservation
- Ecology and Vegetation Dynamics Studies
- Genetic Associations and Epidemiology
- Forensic and Genetic Research
- Wildlife-Road Interactions and Conservation
- Chromosomal and Genetic Variations
- Mathematical and Theoretical Epidemiology and Ecology Models
- Gene expression and cancer classification
- Animal Ecology and Behavior Studies
- Evolution and Paleontology Studies
- Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
- Species Distribution and Climate Change
- Bayesian Methods and Mixture Models
- Cancer Genomics and Diagnostics
- Animal Behavior and Reproduction
- Evolutionary Game Theory and Cooperation
- Stochastic processes and statistical mechanics
- Avian ecology and behavior
- Morphological variations and asymmetry
- Gene Regulatory Network Analysis
University of Oregon
2015-2024
University of Southern California
2013-2021
Michigan State University
2018
University of California, Davis
2010-2015
Ambrose University
2013
The Nature Conservancy
2011
University of California, Berkeley
2008-2011
The recent genealogical history of human populations is a complex mosaic formed by individual migration, large-scale population movements, and other demographic events. Population genomics datasets can provide a window into this recent history, as rare traces of recent shared genetic ancestry are detectable due to long segments of shared genomic material. We make use of genomic data for 2,257 Europeans (in the Population Reference Sample [POPRES] dataset) to conduct one of the first surveys of recent genealogical ancestry over the past 3,000 years at a continental scale. We detected 1.9...
Stochastic simulation is a key tool in population genetics, since the models involved are often analytically intractable and simulation is usually the only way of obtaining ground-truth data to evaluate inferences. Because of this, a large number of specialized simulation programs have been developed, each filling a particular niche, but with largely overlapping functionality and a substantial duplication of effort. Here, we introduce msprime version 1.0, which efficiently implements ancestry and mutation simulations based on...
A classic problem in population genetics is the characterization of discrete population structure in the presence of continuous patterns of genetic differentiation. Especially when sampling is discontinuous, the use of clustering or assignment methods may incorrectly ascribe differentiation due to continuous processes (e.g., geographic isolation by distance) to discrete processes, such as geographic, ecological, or reproductive barriers between populations. This reflects a shortcoming of current methods for inferring and visualizing population structure when applied to genetic data deriving from...
Populations can be genetically isolated both by geographic distance and by differences in their ecology or environment that decrease the rate of successful migration. Empirical studies often seek to investigate the relationship between genetic differentiation and some ecological variable(s) while accounting for geographic distance, but common approaches to this problem (such as the partial Mantel test) have a number of drawbacks. In this article, we present a Bayesian method that enables users to quantify the relative contributions of geographic distance and ecological distance to genetic differentiation between sampled...
Phylogenetic comparative methods may fail to produce meaningful results when either the underlying model is inappropriate or the data contain insufficient information to inform the inference. The ability to measure the statistical power of these methods has become crucial to ensure that data quantity keeps pace with growing model complexity. Through simulations, we show that commonly applied model choice methods based on information criteria can have remarkably high error rates; this can be a problem because methods to estimate the uncertainty or power are not widely known or applied....
There is an increasing demand for evolutionary models to incorporate relatively realistic dynamics, ranging from selection at many genomic sites to complex demography, population structure, and ecological interactions. Such models can generally be implemented as individual-based forward simulations, but the large computational overhead of these models often makes simulation of whole chromosome sequences in large populations infeasible. This situation presents an important obstacle to the field that requires conceptual...
Principal component analysis (PCA) is often used to describe overall population structure, that is, patterns of relatedness arising from past demographic history, among a set of genomes. Here, Li and Ralph describe how the patterns uncovered by.... Population structure leads to systematic patterns in measures of mean relatedness between individuals in large genomic data sets, which are often discovered and visualized using dimension reduction techniques such as principal component analysis (PCA). Mean relatedness is an average of the relationships across locus-specific genealogical...
In this paper we describe how to efficiently record the entire genetic history of a population in forwards-time, individual-based population genetics simulations with arbitrary breeding models, population structure and demography. This approach dramatically reduces the computational burden of tracking individual genomes by allowing us to simulate only those loci that may affect reproduction (those having non-neutral variants). The genetic history of the population is recorded as a succinct tree sequence as introduced in the software package msprime, on which neutral...
The explosion in population genomic data demands ever more complex modes of analysis, and increasingly, these analyses depend on sophisticated simulations. Recent advances in population genetic simulation have made it possible to simulate large and complex models, but specifying such models for a particular simulation engine remains a difficult and error-prone task. Computational genetics researchers currently re-implement simulation models independently, leading to inconsistency and duplication of effort. This situation presents a major barrier to empirical researchers seeking...
Models for detecting the effect of adaptation on population genomic diversity are often predicated on a single newly arisen mutation sweeping rapidly to fixation. However, a population can also adapt to a new environment by multiple mutations of similar phenotypic effect that arise in parallel, at the same locus or different loci. These mutations can each quickly reach intermediate frequency, preventing any one from sweeping to fixation globally, leading to a “soft” sweep in the population. Here we study various models of parallel mutation in a continuous, geographically...
Speciation genomic studies aim to interpret patterns of genome-wide variation in light of the processes that give rise to new species. However, interpreting the genomic "landscape" of speciation is difficult, because many evolutionary processes can impact levels of variation. Facilitated by the first chromosome-level assembly for the group, we use whole-genome sequencing and simulations to shed light on the processes that have shaped the genomic landscape during a radiation of monkeyflowers. After inferring the phylogenetic relationships among the 9 taxa in this radiation, we show that highly...
Geographic patterns of genetic variation within modern populations, produced by complex histories of migration, can be difficult to infer and visually summarize. A general consequence of geographically limited dispersal is that samples from nearby locations tend to be more closely related than samples from distant locations, and so genetic covariance often recapitulates geographic proximity. We use genome-wide polymorphism data to build "geogenetic maps," which, when applied to stationary populations, produces a map of the geographic positions of the populations, but with...
Real geography is continuous, but standard models in population genetics are based on discrete, well-mixed populations. As a result, many methods of analyzing genetic data assume that samples are a random draw from a well-mixed population, but are applied to clustered samples from populations that are structured clinally over space. Here, we use simulations of populations living in continuous geography to study the impacts of dispersal and sampling strategy on population genetic summary statistics, demographic inference, and genome-wide association studies (GWAS). We find that most common...
Most organisms are more closely related to nearby than distant members of their species, creating spatial autocorrelations in genetic data. This allows us to predict the location of origin of a genetic sample by comparing it to a set of samples of known geographic origin. Here, we describe a deep learning method, which we call Locator, to accomplish this task faster and more accurately than existing approaches. In simulations, Locator infers sample location to within 4.1 generations of dispersal and runs at least an order of magnitude faster than a recent model-based approach....
As a genetic mutation is passed down across generations, it distinguishes those genomes that have inherited it from those that have not, providing a glimpse of the genealogical tree relating the genomes to each other at that site. Statistical summaries of genetic variation therefore also describe the underlying genealogies. We use this correspondence to define a general framework that efficiently computes single-site population genetic statistics using the succinct tree sequence encoding of genealogies and genome sequence. The general approach accumulates sample weights...
The geographic nature of biological dispersal shapes patterns of genetic variation over landscapes, making it possible to infer properties of dispersal from genetic variation data. Here, we present an inference tool that uses geographically distributed genotype data in combination with a convolutional neural network to estimate a critical population parameter: the mean per-generation dispersal distance. Using extensive simulation, we show that our deep learning approach is competitive with or outperforms state-of-the-art methods, particularly at small...
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes...
We introduce a broad class of mechanistic spatial models to describe how spatially heterogeneous populations live, die, and reproduce. Individuals are represented by points of a point measure, whose birth and death rates can depend both on spatial position and local population density, defined at a location to be the convolution of the point measure with a suitable non-negative integrable kernel centred on that location. We pass to three different scaling limits: an interacting superprocess, a nonlocal partial differential equation (PDE),...
Two major sources of stochasticity in the dynamics of neutral alleles result from resampling of finite populations (genetic drift) and the random genetic background of nearby selected alleles on which the neutral alleles are found (linked selection). There is now good evidence that linked selection plays an important role in shaping polymorphism levels in a number of species. One of the best-investigated models of linked selection is the recurrent full-sweep model, in which newly arisen selected alleles fix rapidly. However, the bulk of selected alleles that sweep into the population may not be destined for rapid fixation....
Recent genomic studies have highlighted the important role of admixture in shaping genome-wide patterns of diversity. Past admixture leaves a population genomic signature of linkage disequilibrium (LD), reflecting the mixing of parental chromosomes by segregation and recombination. These patterns of LD can be used to infer the timing of admixture, but the results of inference can depend strongly on the assumed demographic model. Here, we introduce a theoretical framework for modeling patterns of LD in a geographic contact zone where two differentiated populations have come into contact and are...
Convergent evolution is the independent evolution of similar traits in different species or lineages of the same species; this often is a result of adaptation to similar environments, a process referred to as convergent adaptation. We investigate here the molecular basis of convergent adaptation in maize to highland climates in Mesoamerica and South America, using genome-wide SNP data. Taking advantage of archaeological data on the arrival of maize to the highlands, we infer demographic models for both populations, identifying evidence of a strong bottleneck and rapid expansion in South America. We use...
The extent to which populations experiencing shared selective pressures adapt through a shared genetic response is relevant to many questions in evolutionary biology. In this article, we explore how standing genetic variation contributes to convergent genetic responses in a geographically spread population. Geographically limited dispersal slows the spread of each selected allele, hence allowing other alleles to spread before any one comes to dominate the population. When selectively equivalent alleles meet, their progress is substantially slowed, dividing the species range...
Since the autosomal genome is shared between the sexes, sex-specific fitness optima present an evolutionary challenge. While sexually antagonistic selection might favor different alleles within females and males, segregation randomly reassorts alleles at autosomal loci between sexes each generation. This process of homogenization during transmission thus prevents between-sex allelic divergence generated by sexually antagonistic selection from accumulating across multiple generations. However, recent empirical studies have reported high...
Hybrid zones formed between recently diverged populations offer an opportunity to study the mechanisms underlying reproductive isolation and the process of speciation. Here, we use a combination of analytical theory and explicit forward simulations to describe how selection against hybrid genotypes impacts patterns of introgression across genomic and geographic space. By describing how lineages move across the hybrid zone, in a model without coalescence, we add to modern understanding of how clines form and how parental haplotypes are broken up...