Andrew G. Clark
- Genetic diversity and population structure
- Evolution and Genetic Dynamics
- Genetic Associations and Epidemiology
- Animal Behavior and Reproduction
- Chromosomal and Genetic Variations
- Genomics and Phylogenetic Studies
- Insect symbiosis and bacterial influences
- Genetic Mapping and Diversity in Plants and Animals
- Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
- CRISPR and Genetic Engineering
- Insect and Arachnid Ecology and Behavior
- Neurobiology and Insect Physiology Research
- Plant and animal studies
- RNA and protein synthesis mechanisms
- Genetic and phenotypic traits in livestock
- Invertebrate Immune Response Mechanisms
- Epigenetics and DNA Methylation
- Bioinformatics and Genomic Networks
- Insect Resistance and Genetics
- Gene expression and cancer classification
- Genomics and Rare Diseases
- Genomics and Chromatin Dynamics
- Insect-Plant Interactions and Control
- Genetic Syndromes and Imprinting
- Physiological and biochemical adaptations
Cornell University
2016-2025
University of Colorado Denver
2024
Pediatrics and Genetics
2010-2023
University of Tübingen
2022-2023
University of Stuttgart
2022-2023
Tri-Institutional PhD Program in Chemical Biology
2019-2021
Weill Cornell Medicine
2019-2021
Pennsylvania State University
1996-2020
University of California, San Francisco
2002-2019
Binghamton University
2014
The 1000 Genomes Project set out to provide a comprehensive description of common human genetic variation by applying whole-genome sequencing to a diverse set of individuals from multiple populations. Here we report completion of the project, having reconstructed the genomes of 2,504 individuals from 26 populations using a combination of low-coverage whole-genome sequencing, deep exome sequencing, and dense microarray genotyping. We characterized a broad spectrum of genetic variation, in total over 88 million variants (84.7 million single nucleotide polymorphisms (SNPs), 3.6...
A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies (a whole-genome assembly and a regional chromosome assembly) were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions...
Genetic and Phenotypic Variation.- Organisation of Genetic Variation.- Random Genetic Drift.- Mutation and the Neutral Theory.- Darwinian Selection.- Inbreeding, Population Subdivision, and Migration.- Molecular Population Genetics.- Evolutionary Quantitative Genetics.- Population Genomics.- Human Population Genetics.
The accelerating pace of genome sequencing throughout the tree of life is driving the need for improved unsupervised annotation of genome components such as transposable elements (TEs). Because the types and sequences of TEs are highly variable across species, automated TE discovery and annotation are challenging and time-consuming tasks. A critical first step is the de novo identification and accurate compilation of sequence models representing all of the unique TE families dispersed in the genome. Here we introduce RepeatModeler2, a pipeline that greatly...
Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency ("dual haplotypes") in a substantial fraction of the genome likely reflects the outbred...
Detecting selective sweeps from genomic SNP data is complicated by the intricate ascertainment schemes used to discover SNPs, and by the confounding influence of the underlying complex demographics and varying mutation and recombination rates. Current methods for detecting selective sweeps have little or no robustness to the demographic assumptions and varying recombination rates, and provide no method for correcting for ascertainment biases. Here, we present several new tests aimed at detecting selective sweeps from genomic SNP data. Using extensive simulations, we show that a parametric test, based on composite likelihood, has high...
Since the divergence of humans and chimpanzees about 5 million years ago, these species have undergone a remarkable evolution with drastic divergence in anatomy and cognitive abilities. At the molecular level, despite the small overall magnitude of DNA sequence divergence, we might expect such evolutionary changes to leave a noticeable signature throughout the genome. We here compare 13,731 annotated genes from humans to their chimpanzee orthologs to identify genes that show evidence of positive selection. Many of the genes that present a signature of positive selection tend to be...
Human mtDNA shows striking regional variation, traditionally attributed to genetic drift. However, it is not easy to account for the fact that only two mtDNA lineages (M and N) left Africa to colonize Eurasia and that lineages A, C, D, and G show a 5-fold enrichment from central Asia to Siberia. As an alternative to drift, natural selection might have enriched for certain mtDNA lineages as people migrated north into colder climates. To test this hypothesis we analyzed 104 complete mtDNA sequences from all global regions and lineages. African mtDNA variation did...
Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using Single Nucleotide Polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African...
High-throughput sequencing technology enables population-level surveys of human genomic variation. Here, we examine the joint allele frequency distributions across continental populations and present an approach for combining complementary aspects of whole-genome, low-coverage data and targeted high-coverage data. We apply this approach to data generated by the pilot phase of the Thousand Genomes Project, including whole-genome 2–4× coverage data for 179 samples from HapMap European, Asian, and African panels as well as high-coverage target sequencing of the exons of 800...
Even though human and chimpanzee gene sequences are nearly 99% identical, sequence comparisons can nevertheless be highly informative in identifying biologically important changes that have occurred since our ancestral lineages diverged. We analyzed alignments of 7645 chimpanzee gene sequences to their human and mouse orthologs. These three-species sequence alignments allowed us to identify genes undergoing natural selection along the human and chimp lineage by fitting models that include parameters specifying rates of synonymous and nonsynonymous nucleotide...
The composition of bacteria in and on the human body varies widely across human individuals, and has been associated with multiple health conditions. While microbial communities are influenced by environmental factors, some degree of genetic influence of the host on the microbiome is also expected. This study is part of an expanding effort to comprehensively profile the interactions between human genetic variation and the composition of this microbial ecosystem on a genome- and microbiome-wide scale.
The frequencies of low-activity alleles of glucose-6-phosphate dehydrogenase in humans are highly correlated with the prevalence of malaria. These “deficiency” alleles are thought to provide reduced risk from infection by the Plasmodium parasite and are maintained at high frequency despite the hemopathologies that they cause. Haplotype analysis of “A−” and “Med” mutations at this locus indicates that they have evolved independently and have increased in frequency at a rate that is too rapid to be explained by random genetic drift. Statistical modeling indicates that the A− allele arose within...
Human populations have experienced recent explosive growth, expanding by at least three orders of magnitude over the past 400 generations. This departure from equilibrium skews patterns of genetic variation and distorts basic principles of population genetics. We characterized the empirical signatures of explosive growth on the site frequency spectrum and found that the discrepancy in rare variant abundance across demographic modeling studies is mostly due to differences in sample size. Rapid recent growth increases the load of rare variants and is likely to play...
Identifying genomic locations that have experienced selective sweeps is an important first step toward understanding the molecular basis of adaptive evolution. Using statistical methods that account for the confounding effects of population demography, recombination rate variation, and single-nucleotide polymorphism ascertainment, while also providing fine-scale estimates of the position of the selected site, we analyzed a genomic dataset of 1.2 million human single-nucleotide polymorphisms genotyped in African-American, European-American,...
Sequence comparisons of genomes or expressed sequence tags (ESTs) from related organisms provide insight into functional conservation and diversification. We compare the sequences of ESTs from the male accessory gland of Drosophila simulans to their orthologs in its close relative Drosophila melanogaster, and demonstrate rapid divergence of many of these reproductive genes. Nineteen (∼11%) of 176 independent genes identified in the EST screen contain protein-coding regions with an excess of nonsynonymous over synonymous changes,...