NFDI4DS | UHH-SEMS - Publication Details

Justin Wagner

ORCID: 0009-0003-8903-0504

About

Contact & Profiles

A5082000812

Research Areas

Genomics and Phylogenetic Studies
Cancer Genomics and Diagnostics
Genomics and Rare Diseases
Chromosomal and Genetic Variations
Genomics and Chromatin Dynamics
Genomic variations and chromosomal abnormalities
Bioinformatics and Genomic Networks
Gut microbiota and health
CRISPR and Genetic Engineering
Gene expression and cancer classification
RNA and protein synthesis mechanisms
Molecular Biology Techniques and Applications
Genetics, Bioinformatics, and Biomedical Research
Evolution and Genetic Dynamics
Law, AI, and Intellectual Property
Cancer-related molecular mechanisms research
RNA Research and Splicing
RNA modifications and cancer
Vaccines and immunoinformatics approaches
Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
Genetic factors in colorectal cancer
Ethics in Clinical Research
Single-cell and spatial transcriptomics
Cell Image Analysis Techniques
Genetic Mapping and Diversity in Plants and Animals

National Institute of Standards and Technology
2019-2025

Material Measurement Laboratory
2019-2025

National Institute of Standards
2022-2024

University of Antwerp
2023

Information Technology Laboratory
2022

Mitre (United States)
2022

University of Alabama in Huntsville
2022

University of Pittsburgh
2022

University of Maryland, College Park
2014-2021

Research Institute for Advanced Computer Science
2017

The complete sequence of a human genome

OPENALEX - Publications

Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V. Bzikadze, and 95 more

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be...

10.1126/science.abj6987 article EN Science 2022-03-31

An open resource for accurately benchmarking small variant and reference calls

Justin M. Zook, Jennifer McDaniel, Nathan D. Olson, Justin Wagner, Hemang Parikh, and 9 more

10.1038/s41587-019-0074-6 article EN Nature Biotechnology 2019-04-01

A complete reference genome improves analysis of human genetic variation

Sergey Aganezov, Stephanie M. Yan, Daniela C. Soto, Melanie Kirsche, Samantha Zarate, and 28 more

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 million base pairs of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. We show how this reference universally improves read mapping and variant calling for 3202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of variants per sample in previously unresolved regions, showcasing the promise of the T2T-CHM13 reference for evolutionary...

10.1126/science.abl3533 article EN Science 2022-03-31

The complete sequence of a human genome

Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V. Bzikadze, and 94 more

In 2001, Celera Genomics and the International Human Genome Sequencing Consortium published their initial drafts of the human genome, which revolutionized the field of genomics. While these drafts and the updates that followed effectively covered the euchromatic fraction of the genome, the heterochromatin and many other complex regions were left unfinished or erroneous. Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to...

10.1101/2021.05.26.445798 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-05-27

Curated variation benchmarks for challenging medically relevant autosomal genes

Justin Wagner, Nathan D. Olson, Lindsay Harris, Jennifer McDaniel, Haoyu Cheng, and 32 more

10.1038/s41587-021-01158-1 article EN Nature Biotechnology 2022-02-07

PrecisionFDA Truth Challenge V2: Calling variants from short and long reads in difficult-to-map regions

Nathan D. Olson, Justin Wagner, Jennifer McDaniel, Sarah H. Stephens, Samuel T. Westreich, and 68 more

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in challenging genomic regions. Starting with FASTQs, 20 challenge participants applied their variant-calling pipelines and submitted 64 call sets for one or more sequencing technologies (Illumina, PacBio HiFi, and Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with updated Genome in a Bottle benchmark sets and genome stratifications. Challenge submissions included numerous...

10.1016/j.xgen.2022.100129 article EN cc-by Cell Genomics 2022-04-27

Semi-automated assembly of high-quality diploid human reference genomes

Erich D. Jarvis, Giulio Formenti, Arang Rhie, Andrea Guarracino, Chentao Yang, and 78 more

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals [3,4]. Recently, a telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line that is nearly homozygous [5]. To address these limitations, the Human Pangenome...

10.1038/s41586-022-05325-5 article EN cc-by Nature 2022-10-19

Benchmarking challenging small variants with linked and long reads

Justin Wagner, Nathan D. Olson, Lindsay Harris, Ziad Khan, Jesse Farek, and 36 more

Genome in a Bottle benchmarks are widely used to help validate clinical sequencing pipelines and develop variant calling methods. Here we use accurate linked and long reads to expand the benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are challenging for short reads. These benchmarks add more than 300,000 SNVs and 50,000 insertions or deletions (indels) and include 16% more exonic variants, many in challenging, clinically relevant genes not covered previously, such as PMS2. For HG002, 92% of the autosomal GRCh38...

10.1016/j.xgen.2022.100128 article EN cc-by Cell Genomics 2022-04-28

Analysis and benchmarking of small and large genomic variants across tandem repeats

Adam C. English, Egor Dolzhenko, Helyaneh Ziaei Jam, Sean K. McKenzie, Nathan D. Olson, and 9 more

10.1038/s41587-024-02225-z article EN Nature Biotechnology 2024-04-26

Traumatic Brain Injury in Mice Induces Acute Bacterial Dysbiosis Within the Fecal Microbiome

Todd J. Treangen, Justin Wagner, Mark P. Burns, Sonia Villapol

The secondary injury cascade that is activated following traumatic brain injury (TBI) induces responses from multiple physiological systems, including the immune system. These responses are not limited to the area of injury; they can also alter peripheral organs such as the intestinal tract. Gut microbiota play a role in the regulation of cell populations and microglia activation, and microbiome dysbiosis is implicated in dysregulation and behavioral abnormalities. However, changes in the gut microbiome induced after acute TBI remain largely unexplored....

10.3389/fimmu.2018.02757 article EN cc-by Frontiers in Immunology 2018-11-27

A diploid assembly-based benchmark for variants in the major histocompatibility complex

Chen-Shan Chin, Justin Wagner, Qiandong Zeng, Erik Garrison, Shilpa Garg, and 21 more

Most human genomes are characterized by aligning individual reads to the reference genome, but accurate long reads and linked reads now enable us to construct accurate, phased de novo assemblies. We focus on a medically important, highly variable, 5 million base-pair (bp) region where diploid assembly is particularly useful - the Major Histocompatibility Complex (MHC). Here, we develop a human genome benchmark derived from a diploid assembly for the openly-consented Genome in a Bottle sample HG002. We assemble a single contig for each...

10.1038/s41467-020-18564-9 article EN cc-by Nature Communications 2020-09-22

Multiscale analysis of pangenomes enables improved representation of genomic diversity for repetitive and clinically relevant genes

Chen-Shan Chin, Sairam Behera, Asif Khalak, Fritz J. Sedlazeck, Peter H. Sudmant, and 2 more

Advancements in sequencing technologies and assembly methods enable the regular production of high-quality genome assemblies characterizing complex regions. However, challenges remain in efficiently interpreting variation at various scales, from smaller tandem repeats to megabase rearrangements, across many human genomes. We present a PanGenome Research Tool Kit (PGR-TK) enabling analyses of pangenome structural and haplotype variation at multiple scales. We apply graph decomposition in PGR-TK to the class II major...

10.1038/s41592-023-01914-y article EN cc-by Nature Methods 2023-06-26

Small variant benchmark from a complete assembly of X and Y chromosomes

Justin Wagner, Nathan D. Olson, Jennifer McDaniel, Lindsay Harris, Brendan J. Pinto, and 22 more

The sex chromosomes contain complex, important genes impacting medical phenotypes, but differ from the autosomes in their ploidy and large repetitive regions. To enable technology developers along with research and clinical laboratories to evaluate variant detection on male X and Y, we create a small variant benchmark set of 111,725 variants for the Genome in a Bottle HG002 reference material. We develop an active evaluation approach to demonstrate that the benchmark reliably identifies errors in challenging genomic regions across...

10.1038/s41467-024-55710-z article EN cc-by Nature Communications 2025-01-08

Assembly and annotation of an Ashkenazi human reference genome

Alaina Shumate, Aleksey V. Zimin, Rachel M. Sherman, Daniela Puiu, Justin Wagner, and 5 more

Background: Thousands of experiments and studies use the human reference genome as a resource each year. This single genome, GRCh38, is a mosaic created from a small number of individuals, representing a very small sample of the population. There is a need for reference genomes from multiple populations to avoid potential biases. Results: Here, we describe the assembly and annotation of the genome of an Ashkenazi individual and the creation of a new, population-specific human reference genome. This genome is more contiguous and complete than the latest version of the human reference genome and is annotated with highly similar gene...

10.1186/s13059-020-02047-7 article EN cc-by Genome Biology 2020-06-02

A complete reference genome improves analysis of human genetic variation

Sergey Aganezov, Stephanie M. Yan, Daniela C. Soto, Melanie Kirsche, Samantha Zarate, and 28 more

Compared to its predecessors, the Telomere-to-Telomere CHM13 genome adds nearly 200 Mbp of sequence, corrects thousands of structural errors, and unlocks the most complex regions of the human genome for clinical and functional study. Here we demonstrate how the new reference universally improves read mapping and variant calling for 3,202 and 17 globally diverse samples sequenced with short and long reads, respectively. We identify hundreds of thousands of novel variants per sample, a new frontier for evolutionary and biomedical discovery. Simultaneously,...

10.1101/2021.07.12.452063 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-07-13

precisionFDA Truth Challenge V2: Calling variants from short- and long-reads in difficult-to-map regions

Nathan D. Olson, Justin Wagner, Jennifer McDaniel, Sarah H. Stephens, Samuel T. Westreich, and 68 more

The precisionFDA Truth Challenge V2 aimed to assess the state of the art of variant calling in difficult-to-map regions and the Major Histocompatibility Complex (MHC). Starting with FASTQ files, 20 challenge participants applied their pipelines and submitted 64 callsets for one or more sequencing technologies (~35X Illumina, ~35X PacBio HiFi, and ~50X Oxford Nanopore Technologies). Submissions were evaluated following best practices for benchmarking small variants with the new GIAB benchmark sets and genome...

10.1101/2020.11.13.380741 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-11-15

Benchmarking challenging small variants with linked and long reads

Justin Wagner, Nathan D. Olson, Lindsay Harris, Jennifer McDaniel, Ziad Khan, and 37 more

Genome in a Bottle (GIAB) benchmarks have been widely used to help validate clinical sequencing pipelines and develop new variant calling methods. Here, we use accurate linked reads and long reads to expand the prior benchmarks in 7 samples to include difficult-to-map regions and segmental duplications that are not readily accessible to short reads. Our benchmark adds more than 300,000 SNVs, 50,000 indels, and 16% more exonic variants, many in challenging, clinically relevant genes not previously covered (e.g., PMS2). For HG002, 92%...

10.1101/2020.07.24.212712 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2020-07-25

Benchmarking of small and large variants across tandem repeats

Adam C. English, Egor Dolzhenko, Helyaneh Ziaei Jam, Sean K. McKenzie, Nathan D. Olson, and 9 more

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and the lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples. We then curate variants from the GIAB HG002 individual to create a tandem repeat benchmark. We also...

10.1101/2023.10.29.564632 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2023-11-01

The GIAB genomic stratifications resource for human reference genomes

Nathan Dwarshuis, Divya Kalra, Jennifer McDaniel, Philippe Sanio, Pilar Álvarez Jerez, and 9 more

Despite the growing variety of sequencing and variant-calling tools, no workflow performs equally well across the entire human genome. Understanding context-dependent performance is critical for enabling researchers, clinicians, and developers to make informed tradeoffs when selecting hardware and software. Here we describe a set of “stratifications,” which are BED files that define distinct contexts throughout the genome. We define these for GRCh37/38 as well as the new T2T-CHM13 reference, adding many hard-to-sequence regions...

10.1038/s41467-024-53260-y article EN cc-by Nature Communications 2024-10-19

Metaviz: interactive statistical and visual analysis of metagenomic data

Justin Wagner, Florin Chelaru, Jayaram Kancherla, Joseph N. Paulson, Alexander Zhang, and 4 more

Large studies profiling microbial communities and their association with healthy or disease phenotypes are now commonplace. Processed data from many of these studies are publicly available, but significant effort is required for users to effectively organize, explore and integrate it, limiting the utility of these rich resources. Effective integrative and interactive visual and statistical tools to analyze metagenomic samples can greatly increase the value of these data for researchers. We present Metaviz, a tool for interactive exploratory analysis of annotated...

10.1093/nar/gky136 article EN cc-by-nc Nucleic Acids Research 2018-02-15

Towards a Comprehensive Variation Benchmark for Challenging Medically-Relevant Autosomal Genes

Justin Wagner, Nathan D. Olson, Lindsay Harris, Jennifer McDaniel, Haoyu Cheng, and 32 more

The repetitive nature and complexity of multiple medically important genes make them intractable to accurate analysis, despite the maturity of short-read sequencing, resulting in a gap in clinical applications of genome sequencing. The Genome in a Bottle Consortium has provided benchmark variant sets, but these excluded some medically relevant genes due to their repetitiveness or polymorphic complexity. In this study, we characterize 273 of 395 challenging autosomal genes that have implications for medical sequencing. This extended,...

10.1101/2021.06.07.444885 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2021-06-07

Automated assembly of high-quality diploid human reference genomes

Erich D. Jarvis, Giulio Formenti, Arang Rhie, Andrea Guarracino, Chentao Yang, and 74 more

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has greatly benefited society [1,2]. However, it still has many gaps and errors, and does not represent a biological genome since it is a blend of multiple individuals [3,4]. Recently, a telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a duplicate genome and is thus nearly homozygous [5]. To address these limitations, the Human...

10.1101/2022.03.06.483034 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-03-06

Development and extensive sequencing of a broadly-consented Genome in a Bottle matched tumor-normal pair for somatic benchmarks

Jennifer McDaniel, V Patel, Nathan D. Olson, Hua‐Jun He, Zhiyong He, and 64 more

The Genome in a Bottle Consortium (GIAB), hosted by the National Institute of Standards and Technology (NIST), is developing new matched tumor-normal samples, the first to be explicitly consented for public dissemination of genomic data and cell lines. Here, we describe a comprehensive genomic dataset from the first individual, HG008, including DNA from an adherent, epithelial-like pancreatic ductal adenocarcinoma (PDAC) tumor cell line (HG008-T) and normal cells from duodenal tissue (HG008-N-D) and pancreatic tissue (HG008-N-P). Data come from thirteen whole...

10.1101/2024.09.18.613544 preprint EN public-domain bioRxiv (Cold Spring Harbor Laboratory) 2024-09-22
