Justin Wagner

ORCID: 0009-0003-8903-0504
Research Areas
  • Genomics and Phylogenetic Studies
  • Cancer Genomics and Diagnostics
  • Genomics and Rare Diseases
  • Chromosomal and Genetic Variations
  • Genomics and Chromatin Dynamics
  • Genomic variations and chromosomal abnormalities
  • Bioinformatics and Genomic Networks
  • Gut microbiota and health
  • CRISPR and Genetic Engineering
  • Gene expression and cancer classification
  • RNA and protein synthesis mechanisms
  • Molecular Biology Techniques and Applications
  • Genetics, Bioinformatics, and Biomedical Research
  • Evolution and Genetic Dynamics
  • Law, AI, and Intellectual Property
  • Cancer-related molecular mechanisms research
  • RNA Research and Splicing
  • RNA modifications and cancer
  • Vaccines and immunoinformatics approaches
  • Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
  • Genetic factors in colorectal cancer
  • Ethics in Clinical Research
  • Single-cell and spatial transcriptomics
  • Cell Image Analysis Techniques
  • Genetic Mapping and Diversity in Plants and Animals

National Institute of Standards and Technology
2019-2025

Material Measurement Laboratory
2019-2025

National Institute of Standards
2022-2024

University of Antwerp
2023

Information Technology Laboratory
2022

Mitre (United States)
2022

University of Alabama in Huntsville
2022

University of Pittsburgh
2022

University of Maryland, College Park
2014-2021

Research Institute for Advanced Computer Science
2017

Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V. Bzikadze, Alla Mikheenko, Mitchell R. Vollger, Nicolas Altemose, Lev Uralsky, Ariel Gershman, Sergey Aganezov, Savannah J. Hoyt, Mark Diekhans, Glennis A. Logsdon, Michael Alonge, Stylianos E. Antonarakis, Matthew Borchers, Gerard G. Bouffard, Shelise Brooks, Gina V. Caldas, Nae-Chyun Chen, Haoyu Cheng, Chen-Shan Chin, William Chow, Leonardo Gomes de Lima, Philip C. Dishuck, Richard Durbin, Tatiana Dvorkina, Ian T. Fiddes, Giulio Formenti, Robert S. Fulton, Arkarachai Fungtammasan, Erik Garrison, Patrick G. S. Grady, Tina A. Graves-Lindsay, Ira M. Hall, Nancy F. Hansen, Gabrielle A. Hartley, Marina Haukness, Kerstin Howe, Michael W. Hunkapiller, Chirag Jain, Miten Jain, Erich D. Jarvis, Peter Kerpedjiev, Melanie Kirsche, Mikhail Kolmogorov, Jonas Korlach, Milinn Kremitzki, Heng Li, Valerie V. Maduro, Tobias Marschall, Ann M. McCartney, Jennifer McDaniel, Danny E. Miller, James C. Mullikin, Eugene W. Myers, Nathan D. Olson, Benedict Paten, Paul Peluso, Pavel A. Pevzner, David Porubský, Tamara Potapova, Е. И. Рогаев, Jeffrey Rosenfeld, Steven L. Salzberg, Valérie Schneider, Fritz J. Sedlazeck, Kishwar Shafin, Colin J. Shew, Alaina Shumate, Ying Sims, Arian F. A. Smit, Daniela C. Soto, Ivan Sović, Jessica M. Storer, Aaron Streets, Beth A. Sullivan, Françoise Thibaud‐Nissen, James Torrance, Justin Wagner, Brian P. Walenz, Aaron M. Wenger, Jonathan Wood, Chunlin Xiao, Stephanie M. Yan, Alice Young, Samantha Zarate, Urvashi Surti, Rajiv C. McCoy, Megan Y. Dennis, Ivan A. Alexandrov, Jennifer L. Gerton, Rachel J. O’Neill, Winston Timp, Justin M. Zook, Michael C. Schatz, Evan E. Eichler, Karen H. Miga, Adam M. Phillippy

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be...

10.1126/science.abj6987 article EN Science 2022-03-31

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary...

10.1126/science.abl3533 article EN Science 2022-03-31
Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V. Bzikadze, Alla Mikheenko, Mitchell R. Vollger, Nicolas Altemose, Lev Uralsky, Ariel Gershman, Sergey Aganezov, Savannah J. Hoyt, Mark Diekhans, Glennis A. Logsdon, Michael Alonge, Stylianos E. Antonarakis, Matthew Borchers, Gerard G. Bouffard, Shelise Brooks, Gina V. Caldas, Haoyu Cheng, Chen-Shan Chin, William Chow, Leonardo Gomes de Lima, Philip C. Dishuck, Richard Durbin, Tatiana Dvorkina, Ian T. Fiddes, Giulio Formenti, Robert S. Fulton, Arkarachai Fungtammasan, Erik Garrison, Patrick G. S. Grady, Tina A. Graves-Lindsay, Ira M. Hall, Nancy F. Hansen, Gabrielle A. Hartley, Marina Haukness, Kerstin Howe, Michael W. Hunkapiller, Chirag Jain, Miten Jain, Erich D. Jarvis, Peter Kerpedjiev, Melanie Kirsche, Mikhail Kolmogorov, Jonas Korlach, Milinn Kremitzki, Heng Li, Valerie V. Maduro, Tobias Marschall, Ann M. McCartney, Jennifer McDaniel, Danny E. Miller, James C. Mullikin, Eugene W. Myers, Nathan D. Olson, Benedict Paten, Paul Peluso, Pavel A. Pevzner, David Porubský, Tamara Potapova, Е. И. Рогаев, Jeffrey Rosenfeld, Steven L. Salzberg, Valérie Schneider, Fritz J. Sedlazeck, Kishwar Shafin, Colin J. Shew, Alaina Shumate, Ying Sims, Arian F. A. Smit, Daniela C. Soto, Ivan Sović, Jessica M. Storer, Aaron Streets, Beth A. Sullivan, Françoise Thibaud‐Nissen, James Torrance, Justin Wagner, Brian P. Walenz, Aaron M. Wenger, Jonathan Wood, Chunlin Xiao, Stephanie M. Yan, Alice Young, Samantha Zarate, Urvashi Surti, Rajiv C. McCoy, Megan Y. Dennis, Ivan A. Alexandrov, Jennifer L. Gerton, Rachel J. O’Neill, Winston Timp, Justin M. Zook, Michael C. Schatz, Evan E. Eichler, Karen H. Miga, Adam M. Phillippy

In 2001, Celera Genomics and the International Human Genome Sequencing Consortium published their initial drafts of the human genome, which revolutionized the field of genomics. While these drafts and the updates that followed effectively covered the euchromatic fraction of the genome, the heterochromatin and many other complex regions were left unfinished or erroneous. Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to...

10.1101/2021.05.26.445798 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-05-27

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. The submissions included numerous...

10.1016/j.xgen.2022.100129 article EN cc-by Cell Genomics 2022-04-27

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome [5]. To address these limitations, the Human Pangenome...

10.1038/s41586-022-05325-5 article EN cc-by Nature 2022-10-19

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling methods. Here we use accurate linked and long reads to expand benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, 92% of the autosomal GRCh38...

10.1016/j.xgen.2022.100128 article EN cc-by Cell Genomics 2022-04-28

The secondary injury cascade that is activated following traumatic brain injury (TBI) induces responses from multiple physiological systems, including the immune system. These responses are not limited to the area of brain injury; they can also alter peripheral organs such as the intestinal tract. Gut microbiota play a role in the regulation of immune cell populations and microglia activation, and microbiome dysbiosis is implicated in immune dysregulation and behavioral abnormalities. However, changes to the gut microbiome induced after acute TBI remain largely unexplored...

10.3389/fimmu.2018.02757 article EN cc-by Frontiers in Immunology 2018-11-27

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful: the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each...

10.1038/s41467-020-18564-9 article EN cc-by Nature Communications 2020-09-22

Advancements in sequencing technologies and assembly methods enable the regular production of high-quality genome assemblies characterizing complex regions. However, challenges remain in efficiently interpreting variation at various scales, from smaller tandem repeats to megabase rearrangements, across many human genomes. We present a PanGenome Research Tool Kit (PGR-TK) enabling analyses of complex pangenome structural and haplotype variation at multiple scales. We apply the graph decomposition methods in PGR-TK to the class II major...

10.1038/s41592-023-01914-y article EN cc-by Nature Methods 2023-06-26

The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers along with research and clinical laboratories to evaluate variant detection on male sex chromosomes X and Y, we create a small variant benchmark set with 111,725 variants for the Genome in a Bottle HG002 reference material. We develop an active evaluation approach to demonstrate that the benchmark set reliably identifies errors in challenging genomic regions across...

10.1038/s41467-024-55710-z article EN cc-by Nature Communications 2025-01-08

Background: Thousands of experiments and studies use the human reference genome as a resource each year. This single reference genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the human population. There is a need for reference genomes from multiple human populations to avoid potential biases. Results: Here, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and complete than GRCh38, the latest version of the human reference genome, and is annotated with highly similar gene...

10.1186/s13059-020-02047-7 article EN cc-by Genome Biology 2020-06-02

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 Mbp of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome to clinical and functional study. Here we demonstrate how the new reference universally improves read mapping and variant calling for 3,202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of novel variants per sample, a new frontier for evolutionary and biomedical discovery. Simultaneously,...

10.1101/2021.07.12.452063 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-07-13

The precisionFDA Truth Challenge V2 aimed to assess the state-of-the-art of variant calling in difficult-to-map regions and the Major Histocompatibility Complex (MHC). Starting with FASTQ files, 20 challenge participants applied their pipelines and submitted 64 callsets for one or more sequencing technologies (~35X Illumina, ~35X PacBio HiFi, and ~50X Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with the new GIAB benchmark sets and genome...

10.1101/2020.11.13.380741 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-11-15

Genome in a Bottle (GIAB) benchmarks have been widely used to help validate clinical sequencing pipelines and develop new variant calling methods. Here, we use accurate linked reads and long reads to expand the prior benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are not readily accessible to short reads. Our new benchmark adds more than 300,000 SNVs, 50,000 indels, and 16% more exonic variants, many in challenging, clinically relevant genes not previously covered (e.g., PMS2). For HG002, 92%...

10.1101/2020.07.24.212712 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-07-25

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also...

10.1101/2023.10.29.564632 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2023-11-01

Despite the growing variety of sequencing and variant-calling tools, no workflow performs equally well across the entire human genome. Understanding context-dependent performance is critical for enabling researchers, clinicians, and developers to make informed tradeoffs when selecting sequencing hardware and software. Here we describe a set of “stratifications,” which are BED files that define distinct contexts throughout the genome. We define these for GRCh37/38 as well as the new T2T-CHM13 reference, adding many new hard-to-sequence regions...

10.1038/s41467-024-53260-y article EN cc-by Nature Communications 2024-10-19

Large studies profiling microbial communities and their association with healthy or disease phenotypes are now commonplace. Processed data from many of these studies are publicly available but significant effort is required for users to effectively organize, explore and integrate it, limiting the utility of these rich data resources. Effective integrative and interactive visual and statistical tools to analyze many metagenomic samples can greatly increase the value of these data for researchers. We present Metaviz, a tool for interactive exploratory data analysis of annotated...

10.1093/nar/gky136 article EN cc-by-nc Nucleic Acids Research 2018-02-15

The repetitive nature and complexity of multiple medically important genes make them intractable to accurate analysis, despite the maturity of short-read sequencing, resulting in a gap in clinical applications of genome sequencing. The Genome in a Bottle Consortium has provided benchmark variant sets, but these excluded some medically relevant genes due to their repetitiveness or polymorphic complexity. In this study, we characterize 273 of these 395 challenging autosomal genes that have multiple implications for medical sequencing. This extended,...

10.1101/2021.06.07.444885 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-06-07

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has greatly benefited society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome since it is a blend of multiple individuals [3,4]. Recently, a telomere-to-telomere reference genome, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a duplicate genome, and is thus nearly homozygous [5]. To address these limitations, the Human...

10.1101/2022.03.06.483034 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-03-06

The Genome in a Bottle Consortium (GIAB), hosted by the National Institute of Standards and Technology (NIST), is developing new matched tumor-normal samples, the first to be explicitly consented for public dissemination of genomic data and cell lines. Here, we describe a comprehensive genomic dataset from the first individual, HG008, including DNA from an adherent, epithelial-like pancreatic ductal adenocarcinoma (PDAC) tumor cell line (HG008-T) and matched normal cells from duodenal tissue (HG008-N-D) and pancreatic tissue (HG008-N-P). Data come from thirteen whole...

10.1101/2024.09.18.613544 preprint EN public-domain bioRxiv (Cold Spring Harbor Laboratory) 2024-09-22