NFDI4DS | UHH-SEMS - Publication Details

Mark Chaisson

ORCID: 0000-0001-5395-1457

About

Contact & Profiles

A5026460408

Research Areas

Genomics and Phylogenetic Studies
RNA and protein synthesis mechanisms
Chromosomal and Genetic Variations
Genomics and Rare Diseases
Genomic variations and chromosomal abnormalities
RNA modifications and cancer
Genetic Associations and Epidemiology
Genetic Mapping and Diversity in Plants and Animals
Genetic Neurodegenerative Diseases
Algorithms and Data Compression
CRISPR and Genetic Engineering
Genomics and Chromatin Dynamics
Bioinformatics and Genomic Networks
Advanced biosensing and bioanalysis techniques
Machine Learning in Bioinformatics
Molecular Biology Techniques and Applications
RNA Research and Splicing
Cancer Genomics and Diagnostics
Evolution and Genetic Dynamics
Planetary Science and Exploration
Genetic factors in colorectal cancer
Gene expression and cancer classification
Marine animal studies overview
Genetics, Bioinformatics, and Biomedical Research
Genome Rearrangement Algorithms

University of Southern California
2017-2025

USC Norris Comprehensive Cancer Center
2023-2024

Southern California University for Professional Studies
2021-2023

University of Washington
2014-2021

LAC+USC Medical Center
2021

Seattle University
2014

Pacific Biosciences (United States)
2011-2012

Cold Spring Harbor Laboratory
2012

University of California, San Diego
2001-2008

STAR: ultrafast universal RNA-seq aligner

OPENALEX - Publications

Alexander Dobin Carrie Davis Felix Schlesinger Jörg Drenkow Chris Zaleski and 4 more

Motivation: Accurate alignment of high-throughput RNA-seq data is a challenging and yet unsolved problem because of the non-contiguous transcript structure, relatively short read lengths and constantly increasing throughput of the sequencing technologies. Currently available RNA-seq aligners suffer from high mapping error rates, low mapping speed, read length limitation and mapping biases. Results: To align our large (>80 billion reads) ENCODE Transcriptome RNA-seq dataset, we developed the Spliced Transcripts Alignment to a Reference (STAR) software...

10.1093/bioinformatics/bts635 article EN Bioinformatics 2012-10-25

An integrated map of structural variation in 2,504 human genomes

OPENALEX - Publications

Peter H. Sudmant Tobias Rausch Eugene J. Gardner Robert E. Handsaker Alexej Abyzov and 77 more

Structural variants are implicated in numerous diseases and make up the majority of varying nucleotides among human genomes. Here we describe an integrated set of eight structural variant classes comprising both balanced and unbalanced variants, which we constructed using short-read DNA sequencing data and statistically phased onto haplotype blocks in 26 human populations. Analysing this set, we identify numerous gene-intersecting structural variants exhibiting population stratification and describe naturally occurring homozygous gene knockouts that suggest...

10.1038/nature15394 article EN cc-by-nc-sa Nature 2015-09-29

Towards complete and error-free genome assemblies of all vertebrate species

OPENALEX - Publications

Arang Rhie Shane McCarthy Olivier Fédrigo Joana Damas Giulio Formenti and 95 more

High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease and biodiversity conservation. However, such assemblies are available for only a few non-microbial species 1–4. To address this issue, the international Genome 10K (G10K) consortium 5,6 has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate...

10.1038/s41586-021-03451-0 article EN cc-by Nature 2021-04-28

Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory

OPENALEX - Publications

Mark Chaisson Glenn Tesler

Recent methods have been developed to perform high-throughput sequencing of DNA by Single Molecule Sequencing (SMS). While Next-Generation sequencing methods may produce reads up to several hundred bases long, SMS sequencing produces reads tens of kilobases long. Existing alignment methods are either too inefficient for high-throughput datasets, or not sensitive enough to align SMS reads, which have a higher error rate than Next-Generation sequencing. We describe the method BLASR (Basic Local Alignment with Successive Refinement) for mapping Single Molecule Sequencing (SMS) reads that are thousands of bases long, with divergence between the read...

10.1186/1471-2105-13-238 article EN cc-by BMC Bioinformatics 2012-09-19

Multi-platform discovery of haplotype-resolved structural variation in human genomes

OPENALEX - Publications

Mark Chaisson Ashley D. Sanders Xuefang Zhao Ankit Malhotra David Porubský and 92 more

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, and strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect...

10.1038/s41467-018-08148-z article EN cc-by Nature Communications 2019-04-16

Resolving the complexity of the human genome using single-molecule sequencing

OPENALEX - Publications

Mark Chaisson John Huddleston Megan Y. Dennis Peter H. Sudmant Maika Malig and 10 more

10.1038/nature13907 article EN Nature 2014-11-10

Haplotype-resolved diverse human genomes and integrated analysis of structural variation

OPENALEX - Publications

Peter Ebert Peter A. Audano Qihui Zhu Bernardo Rodríguez–Martín David Porubský and 60 more

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent-child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average minimum contig length needed to cover 50% of the genome: 26 million base pairs) integrate all forms of genetic variation, even across complex loci. We identified 107,590 structural variants (SVs), of which 68% were not...

10.1126/science.abf7117 article EN Science 2021-02-25

Short read fragment assembly of bacterial genomes

OPENALEX - Publications

Mark Chaisson Pavel A. Pevzner

In the last year, high-throughput sequencing technologies have progressed from proof-of-concept to production quality. While these methods produce high-quality reads, they have yet to produce reads comparable in length to Sanger-based sequencing. Current fragment assembly algorithms have been implemented and optimized for mate-paired Sanger-based reads, and thus do not perform well on short reads produced by short read technologies. We present a new Eulerian assembler that generates nearly optimal short read assemblies of bacterial genomes and describe an approach...

10.1101/gr.7088808 article EN cc-by-nc Genome Research 2007-12-14

Long-read sequence assembly of the gorilla genome

OPENALEX - Publications

David Gordon John Huddleston Mark Chaisson C. Hill Zev Kronenberg and 15 more

Improving on the gorilla genome. Access to complete, high-quality genomes of nonhuman primates will also help us understand human biology. Gordon et al. used long-read sequencing technology to improve genome data on our close relative, the gorilla. Sequencing from a single individual decreased assembly fragmentation and recovered previously missed genes and noncoding loci. Mapping short-read sequences from additional gorillas helped reconstruct a “pan” gorilla sequence documenting genetic variation. Comparison with human genomes revealed...

10.1126/science.aae0344 article EN Science 2016-03-31

Discovery and genotyping of structural variation from long-read haploid genome sequence data

OPENALEX - Publications

John Huddleston Mark Chaisson Karyn Meltz Steinberg Wes Warren Kendra Hoekzema and 11 more

In an effort to more fully understand the full spectrum of human genetic variation, we generated deep single-molecule, real-time (SMRT) sequencing data from two haploid human genomes. By using an assembly-based approach (SMRT-SV), we systematically assessed each genome independently for structural variants (SVs) and indels, resolving the sequence structure of 461,553 genetic variants from 2 bp to 28 kbp in length. We find that >89% of these variants have been missed as part of analysis of the 1000 Genomes Project even after adjusting for more common variants (MAF > 1%)....

10.1101/gr.214007.116 article EN cc-by-nc Genome Research 2016-11-28

High-resolution comparative analysis of great ape genomes

OPENALEX - Publications

Zev Kronenberg Ian T. Fiddes David Gordon Shwetha C. Murali Stuart Cantsilieris and 39 more

A spotlight on great ape genomes. Most nonhuman primate genomes generated to date have been “humanized” owing to their many gaps and the reliance on guidance by the reference human genome. To remove this humanizing effect, Kronenberg et al. assembled long-read genomes of a chimpanzee, an orangutan, and two humans and compared them with a previously generated gorilla genome. This analysis recognized genomic structural variation specific to particular ape lineages. Comparisons between human and chimpanzee cerebral organoids showed down-regulation of expression of genes...

10.1126/science.aar6343 article EN Science 2018-06-07

A robust benchmark for detection of germline large deletions and insertions

OPENALEX - Publications

Justin M. Zook Nancy F. Hansen Nathan D. Olson Lesley M. Chapman James C. Mullikin and 45 more

10.1038/s41587-020-0538-8 article EN Nature Biotechnology 2020-06-15

Assembly of long error-prone reads using de Bruijn graphs

OPENALEX - Publications

Yu Lin Jeffrey Yuan Mikhail Kolmogorov Max W. Shen Mark Chaisson and 1 more

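The recent breakthroughs in assembling long error-prone reads were based on the overlap-layout-consensus (OLC) approach and did not utilize the strengths of the alternative de Bruijn graph approach to genome assembly. Moreover, these studies often assume that applications of the de Bruijn graph approach are limited to short and accurate reads and that the OLC approach is the only practical paradigm for assembling long error-prone reads. We show how to generalize de Bruijn graphs for assembling long error-prone reads and describe the ABruijn assembler, which combines the de Bruijn graph and the OLC approaches and results in accurate genome reconstructions.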

10.1073/pnas.1604560113 article EN Proceedings of the National Academy of Sciences 2016-12-12

Reconstructing complex regions of genomes using long-read sequencing technology

OPENALEX - Publications

John Huddleston Swati Ranade Maika Malig Francesca Antonacci Mark Chaisson and 9 more

Obtaining high-quality sequence continuity of complex regions of recent segmental duplication remains one of the major challenges of finishing genome assemblies. In the human and mouse genomes, this was achieved by targeting large-insert clones using costly and laborious capillary-based sequencing approaches. Sanger shotgun sequencing of clone inserts, however, has now been largely abandoned, leaving most of these regions unresolved in newer genome assemblies generated primarily by next-generation sequencing hybrid approaches. Here we show that it is possible to...

10.1101/gr.168450.113 article EN cc-by-nc Genome Research 2014-01-13

Long-read sequence and assembly of segmental duplications

OPENALEX - Publications

Mitchell R. Vollger Philip C. Dishuck Melanie Sorensen AnneMarie E. Welch Vy Dang and 5 more

10.1038/s41592-018-0236-3 article EN Nature Methods 2018-12-07

Fully phased human genome assembly without parental data using single-cell strand sequencing and long reads

OPENALEX - Publications

David Porubský Peter Ebert Peter A. Audano Mitchell R. Vollger William T. Harvey and 16 more

Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing 1,2 with continuous long-read or high-fidelity 3 sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate...

10.1038/s41587-020-0719-5 article EN cc-by Nature Biotechnology 2020-12-07

Semi-automated assembly of high-quality diploid human reference genomes

OPENALEX - Publications

Erich D. Jarvis Giulio Formenti Arang Rhie Andrea Guarracino Chentao Yang and 78 more

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society 1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome 5. To address these limitations, the Human Pangenome...

10.1038/s41586-022-05325-5 article EN cc-by Nature 2022-10-19

Pangenome graph construction from genome alignments with Minigraph-Cactus

OPENALEX - Publications

Glenn Hickey Jean Monlong Jana Ebler Adam M. Novak Jordan M. Eizenga and 95 more

10.1038/s41587-023-01793-w article EN Nature Biotechnology 2023-05-10

Recombination between heterologous human acrocentric chromosomes

OPENALEX - Publications

Andrea Guarracino Silvia Buonaiuto Leonardo Gomes de Lima Tamara Potapova Arang Rhie and 95 more

The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications 1,2. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium’s CHM13 assembly (T2T-CHM13), provided a model of their homology 3, it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain...

10.1038/s41586-023-05976-y article EN cc-by Nature 2023-05-10

Scalable Nanopore sequencing of human genomes provides a comprehensive view of haplotype-resolved variation and methylation

OPENALEX - Publications

Mikhail Kolmogorov Kimberley Billingsley Mira Mastoras Melissa Meredith Jean Monlong and 25 more

10.1038/s41592-023-01993-x article EN Nature Methods 2023-09-14

A Draft Human Pangenome Reference

OPENALEX - Publications

Wen‐Wei Liao Mobin Asri Jana Ebler Daniel Doerr Marina Haukness and 51 more

The Human Pangenome Reference Consortium (HPRC) presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence and are accurate at the structural and base-pair levels. Based on alignments of the assemblies, we generated a draft pangenome that captures known variants and haplotypes, reveals novel alleles at structurally complex loci, and adds 119 million base pairs of euchromatic polymorphic sequence and 1,529 gene...

10.1101/2022.07.09.499321 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-07-09

Increased mutation and gene conversion within human segmental duplications

OPENALEX - Publications

Mitchell R. Vollger Philip C. Dishuck William T. Harvey William S. DeWitt Xavi Guitart and 95 more

Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data 1,2. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions 3,4. We find that SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence...

10.1038/s41586-023-05895-y article EN cc-by Nature 2023-05-10
