Jonathan Mitchell
- Cancer Genomics and Diagnostics
- Genomics and Rare Diseases
- Genetic diversity and population structure
- Genetic Associations and Epidemiology
- Genomics and Phylogenetic Studies
- Genetic factors in colorectal cancer
- Epigenetics and DNA Methylation
- Acute Myeloid Leukemia Research
- Myeloproliferative Neoplasms: Diagnosis and Treatment
- Gene expression and cancer classification
- Ecology and Vegetation Dynamics Studies
- Genetic Mapping and Diversity in Plants and Animals
- Ferroptosis and cancer prognosis
- Bioinformatics and Genomic Networks
- Pancreatic and Hepatic Oncology Research
- Cancer Immunotherapy and Biomarkers
- Evolution and Genetic Dynamics
- Telomeres, Telomerase, and Senescence
- CRISPR and Genetic Engineering
- Evolution and Paleontology Studies
- Genetics and Neurodevelopmental Disorders
- Bayesian Methods and Mixture Models
- Species Distribution and Climate Change
- Genetic Syndromes and Imprinting
- Chromatin Remodeling and Cancer
Genomics (United Kingdom)
2024-2025
AstraZeneca (United Kingdom)
2022-2025
Genomics England
2022-2024
University of Tasmania
2018-2024
ARC Centre of Excellence for Plant Success in Nature and Agriculture
2024
William Harvey Research Institute
2022-2024
Queen Mary University of London
2022-2024
Kennedy Krieger Institute
2023
University of Alaska Fairbanks
2018-2022
AstraZeneca (Brazil)
2022
Abstract Clonal hematopoiesis (CH), the clonal expansion of a blood stem cell and its progeny driven by somatic driver mutations, affects over a third of people, yet remains poorly understood. Here we analyze genetic data from 200,453 UK Biobank participants to map the landscape of inherited predisposition to CH, increasing the number of germline associations with CH in European-ancestry populations from 4 to 14. Genes at new loci implicate DNA damage repair (PARP1, ATM, CHEK2), hematopoietic stem cell migration/homing (CD164)...
Abstract The Cancer Programme of the 100,000 Genomes Project was an initiative to provide whole-genome sequencing (WGS) for patients with cancer, evaluating opportunities for precision cancer care within the UK National Healthcare System (NHS). Genomics England, alongside NHS England, analyzed WGS data from 13,880 solid tumors spanning 33 cancer types, integrating genomic data with real-world treatment and outcome data, within a secure Research Environment. Incidence of somatic mutations in genes recommended for standard-of-care testing...
Integrating human genomics and proteomics can help elucidate disease mechanisms, identify clinical biomarkers and discover drug targets [1-4]. Because previous proteogenomic studies have focused on common variation via genome-wide association studies, the contribution of rare variants to the plasma proteome remains largely unknown. Here we identify associations between rare protein-coding variants and 2,923 plasma protein abundances measured in 49,736 UK Biobank individuals. Our variant-level exome-wide association study identified 5,433...
Abstract Colorectal carcinoma (CRC) is a common cause of mortality [1], but a comprehensive description of its genomic landscape is lacking [2–9]. Here we perform whole-genome sequencing of 2,023 CRC samples from participants in the UK 100,000 Genomes Project, thereby providing a highly detailed somatic mutational landscape of this cancer. Integrated analyses identify more than 250 putative CRC driver genes, many not previously implicated in CRC or other cancers, including several recurrent changes outside the coding genome. We...
Highlights: • ITSN1 haploinsufficiency confers a ∼10-fold increased risk of Parkinson's disease (PD) • Effect size surpasses other well-established PD loci, including GBA1 and LRRK2 • In vivo and in vitro studies suggest an interaction between ITSN1 and α-synuclein • Findings implicate synaptic vesicle trafficking dysfunction in PD pathogenesis. Summary: Despite its significant heritability, the genetic basis of Parkinson's disease (PD) remains incompletely understood. Here, analyzing whole-genome sequence data from 3,809 PD cases and 247,101...
Telomeres protect chromosome ends from damage and their length is linked with human disease and aging. We developed a joint telomere length metric, combining quantitative PCR and whole-genome sequencing measurements from 462,666 UK Biobank participants. This metric increased SNP heritability, suggesting that it better captures genetic regulation of telomere length. Exome-wide rare-variant and gene-level collapsing association studies identified 64 variants and 30 genes significantly associated with telomere length, including allelic series...
MSCquartets is an R package for species tree hypothesis testing, inference of species trees, and inference of species networks under the Multispecies Coalescent model of incomplete lineage sorting and its network analog. Input for these analyses are collections of metric or topological locus trees, which are then summarized by the quartets displayed on them. Results of hypothesis tests at user-supplied levels are displayed in a simplex plot by color-coded points. The package implements the QDC and WQDC algorithms for species tree inference, and the NANUQ algorithm for level-1 network inference, all of which give statistically consistent estimators...
We performed collapsing analyses on 454,796 UK Biobank (UKB) exomes to detect gene-level associations with diabetes. Recessive carriers of nonsynonymous variants in MAP3K15 were 30% less likely to develop diabetes (P = 5.7 × 10⁻¹⁰) and had lower glycosylated hemoglobin (β = −0.14 SD units, P = 1.1 × 10⁻²⁴). These associations were independent of body mass index, suggesting protection against insulin resistance even in the setting of obesity. We replicated these findings in 96,811 Admixed Americans in the Mexico City Prospective Study <...
Abstract A simple graphical device, the simplex plot of quartet concordance factors, is introduced to aid in the exploration of a collection of gene trees on a common set of taxa. A single plot summarizes all gene tree discord and allows for visual comparison to the expected discord from the multispecies coalescent model (MSC) of incomplete lineage sorting on a species tree. A formal statistical procedure is described that can quantify the deviation from expectation for each subset of four taxa, suggesting when the data are not in accord with the MSC, and thus either inference...
Abstract Genetic variants in chromatin regulators are frequently found in neurodevelopmental disorders, but their effect in disease etiology is rarely determined. Here, we uncover and functionally define pathogenic variants in the chromatin modifier EZH1 as the cause of dominant and recessive neurodevelopmental disorders in 19 individuals. EZH1 encodes one of the two alternative histone H3 lysine 27 methyltransferases of the PRC2 complex. Unlike the other PRC2 subunits, which are involved in cancers and developmental syndromes, the implication of EZH1 in human development and disease is largely unknown. Using...
Abstract Background: Findings from previous gastric cancer microbiome studies have been conflicting, potentially due to patient and/or tumor heterogeneity. The intratumoral gastric cancer microbiome and its relationship with clinicopathological variables have not yet been characterized in detail. We hypothesized that variation in gastric cancer microbial abundance, alpha diversity, and composition is related to clinicopathological characteristics. Methods: Metagenomic analysis of 529 GC samples was performed, including whole exome sequencing data from The Cancer Genome Atlas (TCGA)...
Abstract The unexpected contamination of normal samples with tumour cells reduces variant detection sensitivity, compromising downstream analyses in canonical tumour-normal analyses. Leveraging whole-genome sequencing data available at Genomics England, we develop a tool for normal sample contamination assessment, which we validate in silico and against minimal residual disease testing. From a systematic review of 771 patients...
Abstract The tree of blobs of a species network shows only the tree-like aspects of relationships of taxa on a network, omitting information on network substructures where hybridization or other types of lateral transfer of genetic information occur. By isolating such regions of a network, inference of the tree of blobs can serve as a starting point for a more detailed investigation, or indicate the limit of what may be inferrable without additional assumptions. Building on our theoretical work on the identifiability of the tree of blobs from gene quartet distributions under the Network Multispecies Coalescent...
Abstract Combining human genomics with proteomics is becoming a powerful tool for drug discovery. Associations between genetic variants and protein levels can uncover disease mechanisms, clinical biomarkers, and candidate drug targets. To date, most population-level proteogenomic studies have focused on common alleles through genome-wide association studies (GWAS). Here, we studied the contribution of rare protein-coding variants to 1,472 plasma protein abundances measured via the Olink Explore 1536 assay in 50,829 UK...
The likelihood ratio statistic, with its asymptotic χ² distribution at regular model points, is often used for hypothesis testing. However, the asymptotic distribution can differ at model singularities and boundaries, suggesting the use of a χ² might be problematic nearby. Indeed, poor behavior of a χ² for testing near singularities and boundaries is apparent in simulations, and can lead to conservative or anti-conservative tests. Here we develop a new distribution designed for use in hypothesis testing near singularities and boundaries, which asymptotically agrees with that of the likelihood ratio statistic. For two example trinomial models, arising in the context of inference...
Abstract Background: The diagnostic rate of Mendelian disorders in sequencing studies continues to increase, along with the pace of novel disease gene discovery. However, variant interpretation in novel genes not currently associated with disease is particularly challenging, and strategies combining gene functional evidence with approaches that evaluate the phenotypic similarities between patients and model organisms have proven successful. A full spectrum of intolerance to loss-of-function variation has been previously described, providing...
Abstract Background: Despite its significant heritability, the genetic underpinnings of Parkinson disease (PD) remain incompletely understood, particularly the role of rare variants. Advances in population-scale sequencing now provide an unprecedented opportunity to uncover additional large-effect rare genetic risk factors and expand our understanding of disease mechanisms. Methods: We leveraged whole-genome sequence data with linked electronic health records from 490,560 UK Biobank participants, identifying 3,809 PD cases...
Large reference datasets of protein-coding variation in human populations have allowed us to determine which genes and genic subregions are intolerant of germline genetic variation. There is also a growing number of genes implicated in severe Mendelian diseases that overlap with genes implicated in cancer. We hypothesized that cancer-driving mutations might be enriched in genic subregions that are depleted of germline variation relative to somatic variation. We introduce a new metric, OncMTR (oncology missense tolerance ratio), which uses 125,748 exomes in the Genome Aggregation Database (gnomAD) to identify...
Abstract Telomeres protect the ends of chromosomes from damage, and genetic regulation of their length is associated with human disease and ageing. We developed a joint telomere length (TL) metric, combining both qPCR and whole genome sequencing (WGS) measurements across 462,675 UK Biobank participants, that increased our ability to capture TL heritability by 36% (h² mean = 0.058 to h² combined = 0.079) and improved predictions of age. Exome-wide rare variant (minor allele frequency < 0.001) and gene-level collapsing...
Artificial Intelligence models encoding biology and chemistry are opening new routes to high-throughput and high-quality in-silico drug development. However, their training increasingly relies on computational scale, with recent protein language models (pLM) training on hundreds of graphical processing units (GPUs). We introduce the BioNeMo Framework to facilitate the training of computational biology and chemistry AI models across hundreds of GPUs. Its modular design allows the integration of individual components, such as data loaders, into existing workflows and is open to community contributions....
ABSTRACT To characterise the somatic alterations in colorectal cancer (CRC), we conducted whole-genome sequencing analysis of 2,023 tumours. We provide the most detailed high-resolution map to date of somatic mutations in CRC, and demonstrate associations with clinicopathological features, in particular location in the large bowel. We refined the mutational processes and signatures acting in colorectal tumorigenesis. In analyses across the sample set or restricted to molecular subtypes, we identified 185 CRC driver genes, of which 117 were previously...
Abstract Autosomal recessive whole gene deletions of nephrocystin-1 (NPHP1) result in abnormal structure and function of the primary cilia. These deletions can result in a tubulointerstitial kidney disease known as nephronophthisis and retinal (Senior–Løken syndrome) and neurological (Joubert syndrome) diseases. Nephronophthisis is a common cause of end-stage kidney disease (ESKD) in children and up to 1% of adult onset ESKD. Single nucleotide variants (SNVs) and small insertions and deletions (Indels) have been less well characterised. We used a gene pathogenicity scoring system...