Joanna Collins

ORCID: 0000-0001-5782-5028
Research Areas
  • Genomics and Phylogenetic Studies
  • Chromosomal and Genetic Variations
  • Insect symbiosis and bacterial influences
  • Genetic diversity and population structure
  • Insect Resistance and Genetics
  • Genetic Mapping and Diversity in Plants and Animals
  • RNA and protein synthesis mechanisms
  • Genetic and phenotypic traits in livestock
  • Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
  • Bacteriophages and microbial interactions
  • CRISPR and Genetic Engineering
  • Genomic variations and chromosomal abnormalities
  • Wheat and Barley Genetics and Pathology
  • Plant Disease Resistance and Genetics
  • Nematode management and characterization studies
  • Molecular Biology Techniques and Applications
  • Invertebrate Immune Response Mechanisms
  • Epigenetics and DNA Methylation
  • Genomics and Chromatin Dynamics
  • Evolution and Genetic Dynamics
  • Genetic and Environmental Crop Studies
  • Mollusks and Parasites Studies
  • Environmental DNA in Biodiversity Studies
  • Zebrafish Biomedical Research Applications
  • Animal Behavior and Reproduction

Wellcome Sanger Institute
2016-2025

Shepherd University
2023

Max Planck Institute for Developmental Biology
2013

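A high-quality sequence assembly of the zebrafish genome reveals the largest gene set of any vertebrate and provides information on key genomic features, and comparison to the human reference genome shows that approximately 70% of human protein-coding genes have at least one clear zebrafish orthologue. The genome of the zebrafish, a key model organism for the study of development and human disease, has now been sequenced and published as a well-annotated reference genome. Zebrafish turns out to have the largest gene set of any vertebrate so far sequenced, and few pseudogenes. Importantly for disease studies, comparison between human and zebrafish sequences reveals that 70% of human genes have at least one obvious zebrafish orthologue. A second paper...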

10.1038/nature12111 article EN cc-by-nc-sa Nature 2013-04-16
Arang Rhie, Shane McCarthy, Olivier Fédrigo, Joana Damas, Giulio Formenti, Sergey Koren, Marcela Uliano‐Silva, William Chow, Arkarachai Fungtammasan, Ju‐Wan Kim, Chul Lee, Byung June Ko, Mark Chaisson, Gregory Gedman, Lindsey Cantin, Françoise Thibaud‐Nissen, Leanne Haggerty, Iliana Bista, Michelle Smith, Bettina Haase, Jacquelyn Mountcastle, Sylke Winkler, Sadye Paez, Jason T. Howard, Sonja C. Vernes, Tanya M. Lama, Frank Grützner, Wesley C. Warren, Christopher N. Balakrishnan, David W. Burt, Julia M. George, Matthew T. Biegler, David Iorns, Andrew Digby, Daryl Eason, Bruce C. Robertson, Taylor Edwards, Mark Wilkinson, George F. Turner, Axel Meyer, Andreas F. Kautt, Paolo Franchini, H. William Detrich, Hannes Svardal, Maximilian Wagner, Gavin J. P. Naylor, Martin Pippel, Milan Malinsky, Mark P. Mooney, Maria Simbirsky, Brett T. Hannigan, Trevor Pesout, Marlys L. Houck, Ann C. Misuraca, Sarah B. Kingan, Richard Hall, Zev Kronenberg, Ivan Sović, Christopher Dunn, Zemin Ning, Alex Hastie, Joyce Lee, Siddarth Selvaraj, Richard E. Green, Nicholas H. Putnam, Marta Gut, Jay Ghurye, Erik Garrison, Ying Sims, Joanna Collins, Sarah Pelan, James Torrance, Alan Tracey, Jonathan Wood, Robel E. Dagnew, Dengfeng Guan, Sarah E. London, David F. Clayton, Claudio V. Mello, Samantha R. Friedrich, Peter V. Lovell, Ekaterina Osipova, Farooq O. Al-Ajli, Simona Secomandi, Heebal Kim, Constantina Theofanopoulou, Michael Hiller, Yang Zhou, Robert S. Harris, Kateryna D. Makova, Paul Medvedev, Jinna Hoffman, Patrick Masterson, Karen Clark, Fergal J. Martin, Kevin Howe, Paul Flicek, Brian P. Walenz, Woori Kwak, Hiram Clawson

Abstract High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species 1–4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate...

10.1038/s41586-021-03451-0 article EN cc-by Nature 2021-04-28

Abstract Genome sequence assemblies provide the basis for our understanding of biology. Generating error-free assemblies is therefore the ultimate, but sadly still unachieved, goal of a multitude of research projects. Despite the ever-advancing improvements in data generation, assembly algorithms and pipelines, no automated approach has so far reliably generated near error-free genome assemblies for eukaryotes. Whilst working towards improved datasets and fully automated pipelines, assembly evaluation and curation is actively used to bridge this shortcoming and significantly reduce...

10.1093/gigascience/giaa153 article EN cc-by GigaScience 2021-01-01

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype...

10.1101/gr.213611.116 article EN cc-by-nc Genome Research 2017-04-10

I have read the journal's policy and have the following conflicts: Paul Flicek is married to the deputy editor of PLoS Medicine, Melissa Norton. Evan Eichler is on the board of Pacific Biosciences. Support for this work came from the Intramural Research Program of the NIH, the National Library of Medicine, the European Molecular Biology Laboratory, the Wellcome Trust (grant number 077198), and the Howard Hughes Medical Institute (EEE). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

10.1371/journal.pbio.1001091 article EN cc-by PLoS Biology 2011-07-05

We report full-length draft de novo genome assemblies for 16 widely used inbred mouse strains and find extensive strain-specific haplotype variation. We identify and characterize 2,567 regions on the current mouse reference genome exhibiting the greatest sequence diversity. These regions are enriched for genes involved in pathogen defence and immunity and exhibit enrichment of transposable elements and signatures of recent retrotransposition events. Combinations of alleles and genes unique to an individual strain are commonly observed at these loci,...

10.1038/s41588-018-0223-8 article EN cc-by Nature Genetics 2018-09-25

Abstract The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society 1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome 5. To address these limitations, the Human Pangenome...

10.1038/s41586-022-05325-5 article EN cc-by Nature 2022-10-19
Arang Rhie, Shane McCarthy, Olivier Fédrigo, Joana Damas, Giulio Formenti, Sergey Koren, Marcela Uliano‐Silva, William Chow, Arkarachai Fungtammasan, Gregory Gedman, Lindsey Cantin, Françoise Thibaud‐Nissen, Leanne Haggerty, Chul Lee, Byung June Ko, Ju‐Wan Kim, Iliana Bista, Michelle Smith, Bettina Haase, Jacquelyn Mountcastle, Sylke Winkler, Sadye Paez, Jason T. Howard, Sonja C. Vernes, Tanya M. Lama, Frank Grützner, Wesley C. Warren, Christopher N. Balakrishnan, David W. Burt, Julia M. George, Mathew Biegler, David Iorns, Andrew Digby, Daryl Eason, Taylor Edwards, Mark Wilkinson, George F. Turner, Axel Meyer, Andreas F. Kautt, Paolo Franchini, H. William Detrich, Hannes Svardal, Maximilian Wagner, Gavin J. P. Naylor, Martin Pippel, Milan Malinsky, Mark P. Mooney, Maria Simbirsky, Brett T. Hannigan, Trevor Pesout, Marlys L. Houck, Ann C. Misuraca, Sarah B. Kingan, Richard Hall, Zev Kronenberg, Jonas Korlach, Ivan Sović, Christopher Dunn, Zemin Ning, Alex Hastie, Joyce Lee, Siddarth Selvaraj, Richard E. Green, Nicholas H. Putnam, Jay Ghurye, Erik Garrison, Ying Sims, Joanna Collins, Sarah Pelan, James Torrance, Alan Tracey, Jonathan Wood, Dengfeng Guan, Sarah E. London, David F. Clayton, Claudio V. Mello, Samantha R. Friedrich, Peter V. Lovell, Ekaterina Osipova, Farooq O. Al-Ajli, Simona Secomandi, Heebal Kim, Constantina Theofanopoulou, Yang Zhou, Robert S. Harris, Kateryna D. Makova, Paul Medvedev, Jinna Hoffman, Patrick Masterson, Karen Clark, Fergal J. Martin, Kevin Howe, Paul Flicek, Brian P. Walenz, Woori Kwak, Hiram Clawson, Mark Diekhans, Luis R Nassar, Benedict Paten, R.H. Kraus

Abstract High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are only available for a few non-microbial species 1–4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling the most accurate and complete reference genomes to date. Here we summarize these developments, introduce a set of quality standards, and present lessons...

10.1101/2020.05.22.110833 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2020-05-23

Wheat (Triticum aestivum) is one of the most important food crops, with an urgent need for an increase in its production to feed the growing world. Triticum timopheevii (2n = 4x = 28) is an allotetraploid wheat wild relative species containing A

10.1038/s41597-024-03260-w article EN cc-by Scientific Data 2024-04-23

Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of the reference genome with other bird genomes and with the most comprehensive catalog of genetic markers for the barn swallow. We identify potentially conserved and accelerated genes using the multialignment and estimate genome-wide linkage disequilibrium using the catalog. We use the pangenome to infer...

10.1016/j.celrep.2023.111992 article EN cc-by-nc-nd Cell Reports 2023-01-01

Abstract The human reference genome assembly plays a central role in nearly all aspects of today’s basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009 and reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies...

10.1101/072116 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2016-08-29

We present here a high-quality genome assembly of the brown hare (Lepus europaeus Pallas), based on a fibroblast cell line of a male specimen from Liperi, Eastern Finland. This brown hare genome represents the first Finnish contribution to the European Reference Genome Atlas pilot effort to generate reference genomes for European biodiversity. The genome was assembled using 25X PacBio HiFi sequencing data and scaffolded utilizing a Hi-C chromosome structure capture approach. After manual curation, the assembled genome length was 2,930,972,003 bp with N50 scaffold of 125.8...

10.24072/pcjournal.393 article EN cc-by Peer Community Journal 2024-03-05

Abstract The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has greatly benefited society 1,2. However, it still has many gaps and errors, and does not represent a biological genome since it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference genome, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a duplicate genome, and is thus nearly homozygous 5. To address these limitations, the Human...

10.1101/2022.03.06.483034 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-03-06

Abstract Bread wheat (Triticum aestivum) is a vital staple crop, with an urgent need for increased production to help feed the world’s growing population. Aegilops mutica (2n = 2x = 14; T genome) is a diploid wild relative of wheat carrying valuable agronomic traits, resulting in its extensive exploitation for wheat improvement. This paper reports a chromosome-scale, haplotype-resolved genome assembly of Ae. mutica using HiFi reads and Omni-C data. The final lengths of the curated genomes were ~4.65 Gb (haplotype 1) and 4.56 Gb (haplotype 2),...

10.1038/s41597-025-04737-y article EN cc-by Scientific Data 2025-03-13

We present two genome assemblies, each generated from an individual female Anopheles (Nyssorhynchus) darlingi (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), from wild populations in French Guiana and Peru. The genome sequences are approximately 180 megabases in span. The majority of each assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genomes were...

10.12688/wellcomeopenres.23989.1 preprint EN cc-by Wellcome Open Research 2025-04-10

In contrast to their dioecious relatives, members of the parthenogenetic Diploscapter nematode genus harbour their entire genome within a single pair of highly heterozygous chromosomes. To examine how this unusual karyotype relates to the evolution of parthenogenesis, we generated chromosome-level assemblies for two species in the clade: Diploscapter pachys and Diploscapter coronatus. Sequence comparisons revealed that the genomes are colinear across their entirety, and that multiple ancestral chromosome fusions and extensive genomic rearrangements preceded...

10.1101/2025.04.24.650375 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2025-04-27

The Southern Corroboree frog (Pseudophryne corroboree; Anura; Myobatrachidae) is a Critically Endangered amphibian, according to the IUCN, and is endemic to the Snowy Mountains region of Kosciuszko National Park in New South Wales, Australia. This species has been driven to functional extinction by the introduction of the fungal disease, chytridiomycosis. Here we provide the first reference genome for P. corroboree. Using PacBio HiFi sequencing, Arima Hi-C,...

10.12688/wellcomeopenres.23820.1 preprint EN cc-by Wellcome Open Research 2025-04-30

Combining genome assembly with population and functional genomics can provide valuable insights to development and evolution, as well as tools for species management. Here, we present a chromosome-level genome assembly of the common brushtail possum (Trichosurus vulpecula), a model marsupial threatened in parts of their native range in Australia, but also a major introduced pest in New Zealand. Functional genomics reveals post-natal activation of chemosensory and metabolic genes, reflecting unique adaptations to altricial birth and delayed weaning,...

10.1038/s41467-023-41784-8 article EN cc-by Nature Communications 2023-10-17

Improvements in genome sequencing and assembly are enabling high-quality reference genomes for all species. However, the assembly process is still laborious, computationally and technically demanding, lacks standards for reproducibility, and is not readily scalable. Here we present the latest Vertebrate Genomes Project assembly pipeline and demonstrate that it delivers high-quality reference genomes at scale across a set of vertebrate species arising over the last ~500 million years. The pipeline is versatile and combines PacBio HiFi long-reads and Hi-C-based haplotype phasing in a new...

10.1101/2023.06.28.546576 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-06-30

Abstract Background: Large palindromes (inverted repeats) make up substantial proportions of mammalian sex chromosomes, often contain genes, and have high rates of structural variation arising via ectopic recombination. As a result, they underlie many genomic disorders. Maintenance of the palindromic structure by gene conversion between the arms has been documented, but over longer time periods, palindromes are remarkably labile. Mechanisms of origin and loss of palindromes have, however, received little attention. Results: Here, we use...

10.1186/s13059-019-1816-y article EN cc-by Genome Biology 2019-10-14

With the advent of chromatin-interaction maps, chromosome-level genome assemblies have become a reality for a wide range of organisms. Scaffolding quality is, however, difficult to judge. To explore this gap, we generated multiple chromosome-scale genome assemblies of an emerging wild animal model for carcinogenesis, the California sea lion (Zalophus californianus). Short-read assemblies were scaffolded with two independent chromatin interaction mapping data sets (Hi-C and Chicago), and long-read assemblies with three data types (Hi-C, optical maps and 10X linked...

10.1111/1755-0998.13443 article EN cc-by-nc Molecular Ecology Resources 2021-06-08

We present a genome assembly from an individual female Anopheles gambiae (the malaria mosquito; Arthropoda; Insecta; Diptera; Culicidae), Ifakara strain. The genome sequence is 264 megabases in span. Most of the assembly is scaffolded into three chromosomal pseudomolecules with the X sex chromosome assembled. The complete mitochondrial genome was also assembled and is 15.4 kilobases in length.

10.12688/wellcomeopenres.18854.1 preprint EN cc-by Wellcome Open Research 2023-02-13

Abstract The most commonly employed mammalian model organism is the laboratory mouse. A wide variety of genetically diverse inbred mouse strains, representing distinct physiological states, disease susceptibilities, and biological mechanisms, have been developed over the last century. We report full length draft de novo genome assemblies for 16 of the most widely used inbred strains and reveal for the first time extensive strain-specific haplotype variation. We identify and characterise 2,567 regions on the current Genome Reference...

10.1101/235838 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2018-02-12