- Genomics and Phylogenetic Studies
- Chromosomal and Genetic Variations
- Genomic variations and chromosomal abnormalities
- CRISPR and Genetic Engineering
- RNA and protein synthesis mechanisms
- RNA modifications and cancer
- Nanopore and Nanochannel Transport Studies
- Molecular Biology Techniques and Applications
- Algorithms and Data Compression
- Advanced biosensing and bioanalysis techniques
- Genomics and Chromatin Dynamics
- Genomics and Rare Diseases
- Primate Behavior and Ecology
- RNA Research and Splicing
- Pluripotent Stem Cells Research
- Genetics, Aging, and Longevity in Model Organisms
Jackson Laboratory
2020-2025
University of Arizona
2019
Arizona State University
2017
Diverse inbred mouse strains are important biomedical research models, yet genome characterization of many strains is fundamentally lacking in comparison with humans. In particular, catalogs of structural variants (SVs) (variants ≥ 50 bp) are incomplete, limiting the discovery of causative alleles for phenotypic variation. Here, we resolve genome-wide SVs in 20 genetically distinct mice with long-read sequencing. We report 413,758 site-specific SVs affecting 13% (356 Mbp) of the mouse reference assembly, including 510 previously...
The most dynamic and repetitive regions of great ape genomes have traditionally been excluded from comparative studies [1–3]. Consequently, our understanding of the evolution of our species is incomplete. Here we present haplotype-resolved reference genomes and comparative analyses of six ape species: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan and siamang. We achieve chromosome-level contiguity with substantial sequence accuracy (<1 error in 2.7 megabases) and completely sequence 215 gapless chromosomes...
Transposable elements constitute about half of human genomes, and their role in generating human variation through retrotransposition is broadly studied and appreciated. Structural variants mediated by transposons, which we call transposable element-mediated rearrangements (TEMRs), are less well studied, and the mechanisms leading to their formation as well as their broader impact on human diversity are poorly understood. Here, we identify 493 unique TEMRs across the genomes of three individuals. While homology directed repair is the dominant...
We present haplotype-resolved reference genomes and comparative analyses of six ape species, namely: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. We achieve chromosome-level contiguity with unparalleled sequence accuracy (<1 error in 500,000 base pairs), completely sequencing 215 gapless chromosomes telomere-to-telomere. We resolve challenging regions, such as the major histocompatibility complex and immunoglobulin loci, providing more in-depth evolutionary insights. Comparative...
Diverse sets of complete human genomes are required to construct a pangenome reference and to understand the extent of complex structural variation. Here, we sequence 65 diverse human genomes and build 130 haplotype-resolved assemblies (130 Mbp median continuity), closing 92% of all previous assembly gaps and reaching telomere-to-telomere (T2T) status for 39% of the chromosomes. We highlight complete sequence continuity of complex loci, including the major histocompatibility complex (MHC), SMN1/SMN2, NBPF8, and AMY1/AMY2, and fully resolve 1,852 complex structural variants (SVs). In addition,...
Structural variants (SVs) are implicated in the etiology of Mendelian diseases but have been systematically underascertained owing to sequencing technology limitations. Long-read sequencing enables comprehensive detection of SVs, but approaches for prioritization of candidate SVs are needed. Structural variant Annotation and analysis (SvAnna) assesses all classes of SVs and their intersection with transcripts and regulatory sequences, relating predicted effects on gene function with clinical phenotype data. SvAnna places 87% of deleterious SVs in the top ten...
CRISPR-based technologies have become central to genome engineering. However, editing strategies are dependent on the repair of DNA breaks via endogenous mechanisms, which increases susceptibility to unwanted mutations. Here we complement Cas9 with a recombinase's functionality by fusing a hyperactive mutant resolvase from transposon Tn3, a member of the serine recombinases, to a catalytically inactive Cas9, which we term integrase Cas9 (iCas9). We demonstrate iCas9 targets DNA deletion and integration. First, we validate iCas9's...
Diverse inbred mouse strains are among the foremost models for biomedical research, yet genome characterization of many strains has been fundamentally lacking in comparison to human genomics research. In particular, the discovery and cataloging of structural variants is incomplete, limiting the discovery of potentially causative alleles for phenotypic variation across individuals. Here, we utilized long-read sequencing to resolve genome-wide structural variants (SVs, ≥ 50 bp) in 20 genetically distinct mice. We report 413,758 site-specific SVs...
Nanopores represent the first commercial technology in decades to present a significantly different technique for DNA sequencing, and one of the first technologies to propose direct RNA sequencing. Despite significant differences with previous sequencing technologies, read simulators to date make similar assumptions with respect to error profiles and their analysis, resulting in incorrect characterization of nanopore error. This is a great disservice to both nanopore sequencing and computer scientists who seek to optimize tools for the platform. Previous works...
Nanopore sequencing has introduced the ability to sequence long stretches of DNA, enabling the resolution of repeating segments, or paired SNPs across DNA. Unfortunately, significant error rates of >15%, introduced through systematic and random noise, inhibit downstream analysis. We propose a novel method, using unsupervised learning, to correct biologically amplified reads before downstream analysis proceeds. We also demonstrate that our method has performance comparable to existing techniques without limiting detection...
Nanopores represent the first commercial technology in decades to present a significantly different technique for DNA sequencing, and one of the first technologies to propose direct RNA sequencing. Despite significant differences with previous sequencing technologies, read simulators to date make similar assumptions with respect to error profiles and their analysis. This is a great disservice to both nanopore sequencing and computer scientists who seek to optimize tools for the platform. Previous works have discussed the occurrence of some k-mer...
Nanopore sequencing has introduced the ability to sequence long stretches of DNA, enabling the resolution of repeating segments, or paired SNPs across DNA. Unfortunately, significant error rates of >15%, introduced through systematic and random noise, inhibit downstream analysis. We propose a novel method, using unsupervised learning, to correct biologically amplified reads before downstream analysis proceeds. We also demonstrate that our method has performance comparable to existing techniques without limiting the detection of repeats, length...
Structural variants (SVs) are implicated in the etiology of Mendelian diseases but have been systematically underascertained owing to limitations of existing technology. Recent technological advances such as long-read sequencing (LRS) enable more comprehensive detection of SVs, but approaches for clinical prioritization of candidate SVs are needed. Existing computational approaches do not specifically target LRS data, thereby missing a substantial proportion of SVs, and do not provide a unified model for assessing all types of SVs....