Joanna Collins
- Genomics and Phylogenetic Studies
- Chromosomal and Genetic Variations
- Insect symbiosis and bacterial influences
- Genetic diversity and population structure
- Insect Resistance and Genetics
- Genetic Mapping and Diversity in Plants and Animals
- RNA and protein synthesis mechanisms
- Genetic and phenotypic traits in livestock
- Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
- Bacteriophages and microbial interactions
- CRISPR and Genetic Engineering
- Genomic variations and chromosomal abnormalities
- Wheat and Barley Genetics and Pathology
- Plant Disease Resistance and Genetics
- Nematode management and characterization studies
- Molecular Biology Techniques and Applications
- Invertebrate Immune Response Mechanisms
- Epigenetics and DNA Methylation
- Genomics and Chromatin Dynamics
- Evolution and Genetic Dynamics
- Genetic and Environmental Crop Studies
- Mollusks and Parasites Studies
- Environmental DNA in Biodiversity Studies
- Zebrafish Biomedical Research Applications
- Animal Behavior and Reproduction
Wellcome Sanger Institute
2016-2025
Shepherd University
2023
Max Planck Institute for Developmental Biology
2013
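A high-quality sequence assembly of the zebrafish genome reveals the largest gene set of any vertebrate and provides information on key genomic features, and comparison to the human reference genome shows that approximately 70% of human protein-coding genes have at least one clear zebrafish orthologue. The zebrafish, a model organism for the study of development and disease, has now been sequenced and published as a well-annotated reference genome. Zebrafish turns out to have the largest gene set of any vertebrate so far sequenced, and few pseudogenes. Importantly for disease studies, comparison between human and zebrafish sequences shows that 70% of human protein-coding genes have an obvious zebrafish orthologue. A second paper...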
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species 1–4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate...
Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved, goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes. Whilst working towards improved datasets and fully automated pipelines, assembly evaluation and curation is actively used to bridge this shortcoming and significantly reduce...
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype...
I have read the journal's policy and have the following conflicts: Paul Flicek is married to the deputy editor of PLoS Medicine, Melissa Norton. Evan Eichler is on the board of Pacific Biosciences. Support for this work came from the Intramural Research Program of the NIH, the National Library of Medicine, the European Molecular Biology Laboratory, the Wellcome Trust (grant number 077198), and the Howard Hughes Medical Institute (EEE). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci,...
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society 1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome 5. To address these limitations, the Human Pangenome...
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are only available for a few non-microbial species 1–4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling the most accurate and complete reference genomes to date. Here we summarize these developments, introduce a set of quality standards, and present lessons...
Wheat (Triticum aestivum) is one of the most important food crops with an urgent need for increase in its production to feed the growing world. Triticum timopheevii (2n = 4x = 28) is an allotetraploid wheat wild relative species containing A
Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of other bird genomes and with the most comprehensive catalog of genetic markers for the barn swallow. We identify potentially conserved and accelerated genes using the multialignment and estimate genome-wide linkage disequilibrium using the catalog. We use the pangenome to infer...
The human reference genome assembly plays a central role in nearly all aspects of today’s basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies...
We present here a high-quality genome assembly of the brown hare (Lepus europaeus Pallas), based on a fibroblast cell line of a male specimen from Liperi, Eastern Finland. This genome assembly represents the first Finnish contribution to the European Reference Genome Atlas pilot effort to generate reference genomes for European biodiversity. The genome was assembled using 25X PacBio HiFi sequencing data and scaffolded utilizing a Hi-C chromosome structure capture approach. After manual curation, the assembled genome length was 2,930,972,003 bp with N50 scaffold of 125.8...
The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has greatly benefited society 1, 2. However, it still has many gaps and errors, and does not represent a biological genome since it is a blend of multiple individuals 3, 4. Recently, a high-quality telomere-to-telomere reference genome, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a duplicate genome, and is thus nearly homozygous 5. To address these limitations, the Human...
Bread wheat (Triticum aestivum) is a vital staple crop, with an urgent need for increased production to help feed the world’s growing population. Aegilops mutica (2n = 2x = 14; T genome) is a diploid wild relative of wheat carrying valuable agronomic traits resulting in its extensive exploitation for wheat improvement. This paper reports a chromosome-scale, haplotype-resolved genome assembly of Ae. mutica using HiFi reads and Omni-C data. The final lengths of the curated genomes were ~4.65 Gb (haplotype 1) and 4.56 Gb (haplotype 2),...
We present two genome assemblies, each generated from an individual female Anopheles (Nyssorhynchus) darlingi (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), from wild populations in French Guiana and Peru. The genome sequences are approximately 180 megabases in span. The majority of each assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genomes were...
In contrast to their dioecious relatives, members of the parthenogenetic Diploscapter nematode genus harbour their entire genome within a single pair of highly heterozygous chromosomes. To examine how this unusual karyotype relates to the evolution of parthenogenesis, we generated chromosome-level assemblies for two species in the clade: Diploscapter pachys and Diploscapter coronatus. Sequence comparisons revealed that the genomes are colinear across their entirety, and that multiple ancestral chromosome fusions and extensive genomic rearrangements preceded...
The Southern Corroboree frog (Pseudophryne corroboree; Anura; Myobatrachidae) is a Critically Endangered amphibian, according to the IUCN, and is endemic to the Snowy Mountains region of Kosciuszko National Park in New South Wales, Australia. This species has been driven to functional extinction by the introduction of a fungal disease, chytridiomycosis. Here we provide the first reference genome for P. corroboree. Using PacBio HiFi sequencing, Arima Hi-C,...
Combining genome assembly with population and functional genomics can provide valuable insights to development and evolution, as well as tools for species management. Here, we present a chromosome-level genome assembly of the common brushtail possum (Trichosurus vulpecula), a model marsupial threatened in parts of their native range in Australia, but also a major introduced pest in New Zealand. Functional genomics reveals post-natal activation of chemosensory and metabolic genes, reflecting unique adaptations to altricial birth and delayed weaning,...
Improvements in genome sequencing and assembly are enabling high-quality reference genomes for all species. However, the assembly process is still laborious, computationally and technically demanding, lacks standards for reproducibility, and is not readily scalable. Here we present the latest Vertebrate Genomes Project assembly pipeline and demonstrate that it delivers high-quality reference genomes at scale across a set of vertebrate species arising over the last ~500 million years. The pipeline is versatile and combines PacBio HiFi long-reads and Hi-C-based haplotype phasing in a new...
Background: Large palindromes (inverted repeats) make up substantial proportions of mammalian sex chromosomes, often contain genes, and have high rates of structural variation arising via ectopic recombination. As a result, they underlie many genomic disorders. Maintenance of the palindromic structure by gene conversion between the arms has been documented, but over longer time periods, palindromes are remarkably labile. Mechanisms of origin and loss of palindromes have, however, received little attention. Results: Here, we use...
With the advent of chromatin-interaction maps, chromosome-level genome assemblies have become a reality for a wide range of organisms. Scaffolding quality is, however, difficult to judge. To explore this gap, we generated multiple chromosome-scale genome assemblies of an emerging wild animal model for carcinogenesis, the California sea lion (Zalophus californianus). Short-read assemblies were scaffolded with two independent chromatin interaction mapping data sets (Hi-C and Chicago), and long-read assemblies with three data types (Hi-C, optical maps and 10X linked...
We present a genome assembly from an individual female Anopheles gambiae (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), Ifakara strain. The genome sequence is 264 megabases in span. Most of the assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length.
The most commonly employed mammalian model organism is the laboratory mouse. A wide variety of genetically diverse inbred mouse strains, representing distinct physiological states, disease susceptibilities, and biological mechanisms, have been developed over the last century. We report full length draft de novo genome assemblies for 16 widely used inbred mouse strains and reveal for the first time extensive strain-specific haplotype variation. We identify and characterise 2,567 regions on the current Genome Reference...