Marina Haukness

ORCID: 0000-0001-9991-8089
Research Areas
  • Genomics and Phylogenetic Studies
  • Chromosomal and Genetic Variations
  • Genomic variations and chromosomal abnormalities
  • Gene expression and cancer classification
  • Genomics and Chromatin Dynamics
  • Machine Learning in Bioinformatics
  • Bioinformatics and Genomic Networks
  • CRISPR and Genetic Engineering
  • Genomics and Rare Diseases
  • Evolution and Genetic Dynamics
  • Genetic diversity and population structure
  • Algorithms and Data Compression
  • Advanced biosensing and bioanalysis techniques
  • Animal testing and alternatives
  • Genome Rearrangement Algorithms
  • Genetic Mapping and Diversity in Plants and Animals
  • Genetic Associations and Epidemiology
  • Primate Behavior and Ecology
  • Cancer Genomics and Diagnostics
  • RNA and protein synthesis mechanisms
  • Genetic factors in colorectal cancer

University of California, Santa Cruz
2018-2024

Genomics (United Kingdom)
2019

Sergey Nurk Sergey Koren Arang Rhie Mikko Rautiainen Andrey V. Bzikadze and 95 more Alla Mikheenko Mitchell R. Vollger Nicolas Altemose Lev Uralsky Ariel Gershman Sergey Aganezov Savannah J. Hoyt Mark Diekhans Glennis A. Logsdon Michael Alonge Stylianos E. Antonarakis Matthew Borchers Gerard G. Bouffard Shelise Brooks Gina V. Caldas Nae-Chyun Chen Haoyu Cheng Chen-Shan Chin William Chow Leonardo Gomes de Lima Philip C. Dishuck Richard Durbin Tatiana Dvorkina Ian T. Fiddes Giulio Formenti Robert S. Fulton Arkarachai Fungtammasan Erik Garrison Patrick G. S. Grady Tina A. Graves-Lindsay Ira M. Hall Nancy F. Hansen Gabrielle A. Hartley Marina Haukness Kerstin Howe Michael W. Hunkapiller Chirag Jain Miten Jain Erich D. Jarvis Peter Kerpedjiev Melanie Kirsche Mikhail Kolmogorov Jonas Korlach Milinn Kremitzki Heng Li Valerie V. Maduro Tobias Marschall Ann M. Mc Cartney Jennifer McDaniel Danny E. Miller James C. Mullikin Eugene W. Myers Nathan D. Olson Benedict Paten Paul Peluso Pavel A. Pevzner David Porubský Tamara Potapova Е. И. Рогаев Jeffrey Rosenfeld Steven L. Salzberg Valérie Schneider Fritz J. Sedlazeck Kishwar Shafin Colin J. Shew Alaina Shumate Ying Sims Arian F. A. Smit Daniela C. Soto Ivan Sović Jessica M. Storer Aaron Streets Beth A. Sullivan Françoise Thibaud‐Nissen James Torrance Justin Wagner Brian P. Walenz Aaron M. Wenger Jonathan Wood Chunlin Xiao Stephanie M. Yan Alice Young Samantha Zarate Urvashi Surti Rajiv C. McCoy Megan Y. Dennis Ivan A. Alexandrov Jennifer L. Gerton Rachel J. O’Neill Winston Timp Justin M. Zook Michael C. Schatz Evan E. Eichler Karen H. Miga Adam M. Phillippy

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be...

10.1126/science.abj6987 article EN Science 2022-03-31
Wen‐Wei Liao Mobin Asri Jana Ebler Daniel Doerr Marina Haukness and 95 more Glenn Hickey Shuangjia Lu Julian Lucas Jean Monlong Haley Abel Silvia Buonaiuto Xian Chang Haoyu Cheng Justin Chu Vincenza Colonna Jordan M. Eizenga Xiaowen Feng Christian Fischer Robert S. Fulton Shilpa Garg Cristian Groza Andrea Guarracino William T. Harvey Simon Heumos Kerstin Howe Miten Jain Tsung-Yu Lu Charles Markello Fergal J. Martin Matthew W. Mitchell Katherine M. Munson Moses Njagi Mwaniki Adam M. Novak Hugh E. Olsen Trevor Pesout David Porubský Pjotr Prins Jonas A. Sibbesen Jouni Sirén Chad Tomlinson Flavia Villani Mitchell R. Vollger Lucinda Antonacci-Fulton Gunjan Baid Carl Baker Anastasiya Belyaeva Konstantinos Billis Andrew Carroll Pi-Chuan Chang Sarah Cody Daniel E. Cook Robert Cook‐Deegan Omar E. Cornejo Mark Diekhans Peter Ebert Susan Fairley Olivier Fédrigo Adam L. Felsenfeld Giulio Formenti Adam Frankish Yan Gao Nanibaa’ A. Garrison Carlos García Girón Richard E. Green Leanne Haggerty Kendra Hoekzema Thibaut Hourlier Hanlee P. Ji Eimear E. Kenny Barbara A. Koenig Alexey Kolesnikov Jan O. Korbel Jennifer Kordosky Sergey Koren HoJoon Lee Alexandra P. Lewis Hugo Magalhães Santiago Marco‐Sola Pierre Marijon Ann M. Mc Cartney Jennifer McDaniel Jacquelyn Mountcastle Maria Nattestad Sergey Nurk Nathan D. Olson Alice B. Popejoy Daniela Puiu Mikko Rautiainen Allison Regier Arang Rhie Samuel Sacco Ashley D. Sanders Valérie Schneider Baergen I. Schultz Kishwar Shafin Michael W. Smith Heidi J. Sofia Ahmad Abou Tayoun Françoise Thibaud‐Nissen Francesca Floriana Tricomi

Abstract Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals 1. These assemblies cover more than 99% of the expected sequence in each genome and are accurate at the structural and base pair levels. Based on alignments of the assemblies, we generate a draft pangenome that captures known variants and haplotypes and reveals new alleles at structurally complex loci. We also add 119 million base pairs of euchromatic polymorphic...

10.1038/s41586-023-05896-x article EN cc-by Nature 2023-05-10

Abstract De novo assembly of a human genome using nanopore long-read sequences has been reported, but it used more than 150,000 CPU hours and weeks of wall-clock time. To enable rapid human genome assembly, we present Shasta, a de novo long-read assembler, and polishing algorithms named MarginPolish and HELEN. Using a single PromethION nanopore sequencer and our toolkit, we assembled 11 highly contiguous human genomes de novo in 9 d. We achieved roughly 63× coverage, 42-kb read N50 values and 6.5× coverage in reads >100 kb using three flow cells per sample. Shasta produced...

10.1038/s41587-020-0503-6 article EN cc-by Nature Biotechnology 2020-05-04

Existing human genome assemblies have almost entirely excluded repetitive sequences within and near centromeres, limiting our understanding of their organization, evolution, and functions, which include facilitating proper chromosome segregation. Now, a complete, telomere-to-telomere human genome assembly (T2T-CHM13) has enabled us to comprehensively characterize pericentromeric and centromeric repeats, which constitute 6.2% of the genome (189.9 megabases). Detailed maps of these regions revealed multimegabase structural...

10.1126/science.abl4178 article EN Science 2022-03-31

10.1038/s41586-023-06457-y article EN Nature 2023-08-23

Abstract Human genomes are typically assembled as consensus sequences that lack information on parental haplotypes. Here we describe a reference-free workflow for diploid de novo genome assembly that combines the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing 1,2 with continuous long-read or high-fidelity 3 sequencing data. Employing this strategy, we produced a completely phased de novo genome assembly for each haplotype of an individual of Puerto Rican descent (HG00733) in the absence of parental data. The assemblies are accurate...

10.1038/s41587-020-0719-5 article EN cc-by Nature Biotechnology 2020-12-07
Sergey Nurk Sergey Koren Arang Rhie Mikko Rautiainen Andrey V. Bzikadze and 94 more Alla Mikheenko Mitchell R. Vollger Nicolas Altemose Lev Uralsky Ariel Gershman Sergey Aganezov Savannah J. Hoyt Mark Diekhans Glennis A. Logsdon Michael Alonge Stylianos E. Antonarakis Matthew Borchers Gerard G. Bouffard Shelise Brooks Gina V. Caldas Haoyu Cheng Chen-Shan Chin William Chow Leonardo Gomes de Lima Philip C. Dishuck Richard Durbin Tatiana Dvorkina Ian T. Fiddes Giulio Formenti Robert S. Fulton Arkarachai Fungtammasan Erik Garrison Patrick G. S. Grady Tina A. Graves-Lindsay Ira M. Hall Nancy F. Hansen Gabrielle A. Hartley Marina Haukness Kerstin Howe Michael W. Hunkapiller Chirag Jain Miten Jain Erich D. Jarvis Peter Kerpedjiev Melanie Kirsche Mikhail Kolmogorov Jonas Korlach Milinn Kremitzki Heng Li Valerie V. Maduro Tobias Marschall Ann M. Mc Cartney Jennifer McDaniel Danny E. Miller James C. Mullikin Eugene W. Myers Nathan D. Olson Benedict Paten Paul Peluso Pavel A. Pevzner David Porubský Tamara Potapova Е. И. Рогаев Jeffrey Rosenfeld Steven L. Salzberg Valérie Schneider Fritz J. Sedlazeck Kishwar Shafin Colin J. Shew Alaina Shumate Yumi Sims Arian F. A. Smit Daniela C. Soto Ivan Sović Jessica M. Storer Aaron Streets Beth A. Sullivan Françoise Thibaud‐Nissen James Torrance Justin Wagner Brian P. Walenz Aaron M. Wenger Jonathan Wood Chunlin Xiao Stephanie M. Yan Alice Young Samantha Zarate Urvashi Surti Rajiv C. McCoy Megan Y. Dennis Ivan A. Alexandrov Jennifer L. Gerton Rachel J. O’Neill Winston Timp Justin M. Zook Michael C. Schatz Evan E. Eichler Karen H. Miga Adam M. Phillippy

Abstract In 2001, Celera Genomics and the International Human Genome Sequencing Consortium published their initial drafts of the human genome, which revolutionized the field of genomics. While these drafts and the updates that followed effectively covered the euchromatic fraction of the genome, the heterochromatin and many other complex regions were left unfinished or erroneous. Addressing this remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium has finished the first truly complete 3.055 billion base pair (bp) sequence of a human genome, representing the largest improvement to...

10.1101/2021.05.26.445798 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2021-05-27

A high-quality rhesus macaque genome. Genome technology has improved substantially since the first full organismal genomes were generated. Applying new technology, Warren et al. refined the genome of the rhesus macaque, a model nonhuman primate. Long-read technology and other recent advances in sequencing were applied to generate a genome with far fewer gaps and helped refine the locations and numbers of repetitive elements. Furthermore, the authors performed resequencing among populations to identify genetic variability in the macaque. Thus, a previously incomplete...

10.1126/science.abc6617 article EN Science 2020-12-18

Abstract The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society 1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome 5. To address these limitations, the Human Pangenome...

10.1038/s41586-022-05325-5 article EN cc-by Nature 2022-10-19
Glenn Hickey Jean Monlong Jana Ebler Adam M. Novak Jordan M. Eizenga and 95 more Yan Gao Haley Abel Lucinda Antonacci-Fulton Mobin Asri Gunjan Baid Carl Baker Anastasiya Belyaeva Konstantinos Billis Guillaume Bourque Silvia Buonaiuto Andrew Carroll Mark Chaisson Pi-Chuan Chang Xian Chang Haoyu Cheng Justin Chu Sarah Cody Vincenza Colonna Daniel E. Cook Robert Cook‐Deegan Omar E. Cornejo Mark Diekhans Daniel Doerr Peter Ebert Jana Ebler Evan E. Eichler Susan Fairley Olivier Fédrigo Adam L. Felsenfeld Xiaowen Feng Christian Fischer Paul Flicek Giulio Formenti Adam Frankish Robert S. Fulton Shilpa Garg Erik Garrison Nanibaa’ A. Garrison Carlos García Girón Richard E. Green Cristian Groza Andrea Guarracino Leanne Haggerty Ira M. Hall William T. Harvey Marina Haukness David Haussler Simon Heumos Kendra Hoekzema Thibaut Hourlier Kerstin Howe Miten Jain Erich D. Jarvis Hanlee P. Ji Eimear E. Kenny Barbara A. Koenig Alexey Kolesnikov Jan O. Korbel Jennifer Kordosky Sergey Koren HoJoon Lee Alexandra P. Lewis Wen‐Wei Liao Shuangjia Lu Tsung-Yu Lu Julian Lucas Hugo Magalhães Santiago Marco‐Sola Pierre Marijon Charles Markello Tobias Marschall Fergal J. Martin Ann M. Mc Cartney Jennifer McDaniel Karen H. Miga Matthew W. Mitchell Jacquelyn Mountcastle Katherine M. Munson Moses Njagi Mwaniki Maria Nattestad Sergey Nurk Hugh E. Olsen Nathan D. Olson Trevor Pesout Adam M. Phillippy Alice B. Popejoy David Porubský Pjotr Prins Daniela Puiu Mikko Rautiainen Allison Regier Arang Rhie Samuel Sacco Ashley D. Sanders Valérie Schneider

10.1038/s41587-023-01793-w article EN Nature Biotechnology 2023-05-10
Andrea Guarracino Silvia Buonaiuto Leonardo Gomes de Lima Tamara Potapova Arang Rhie and 95 more Sergey Koren Boris Rubinstein Christian Fischer Haley Abel Lucinda Antonacci-Fulton Mobin Asri Gunjan Baid Carl Baker Anastasiya Belyaeva Konstantinos Billis Guillaume Bourque Andrew Carroll Mark Chaisson Pi-Chuan Chang Xian Chang Haoyu Cheng Justin Chu Sarah Cody Daniel E. Cook Robert Cook‐Deegan Omar E. Cornejo Mark Diekhans Daniel Doerr Peter Ebert Jana Ebler Evan E. Eichler Jordan M. Eizenga Susan Fairley Olivier Fédrigo Adam L. Felsenfeld Xiaowen Feng Paul Flicek Giulio Formenti Adam Frankish Robert S. Fulton Yan Gao Shilpa Garg Nanibaa’ A. Garrison Carlos García Girón Richard E. Green Cristian Groza Leanne Haggerty Ira M. Hall William T. Harvey Marina Haukness David Haussler Simon Heumos Glenn Hickey Kendra Hoekzema Thibaut Hourlier Kerstin Howe Miten Jain Erich D. Jarvis Hanlee P. Ji Eimear E. Kenny Barbara A. Koenig Alexey Kolesnikov Jan O. Korbel Jennifer Kordosky HoJoon Lee Alexandra P. Lewis Heng Li Wen‐Wei Liao Shuangjia Lu Tsung-Yu Lu Julian Lucas Hugo Magalhães Santiago Marco‐Sola Pierre Marijon Charles Markello Tobias Marschall Fergal J. Martin Ann M. Mc Cartney Jennifer McDaniel Karen H. Miga Matthew W. Mitchell Jean Monlong Jacquelyn Mountcastle Katherine M. Munson Moses Njagi Mwaniki Maria Nattestad Adam M. Novak Sergey Nurk Hugh E. Olsen Nathan D. Olson Benedict Paten Trevor Pesout Alice B. Popejoy David Porubský Pjotr Prins Daniela Puiu Mikko Rautiainen Allison Regier Samuel Sacco Ashley D. Sanders

Abstract The short arms of the human acrocentric chromosomes 13, 14, 15, 21 and 22 (SAACs) share large homologous regions, including ribosomal DNA repeats and extended segmental duplications 1,2. Although the resolution of these regions in the first complete assembly of a human genome, the Telomere-to-Telomere Consortium’s CHM13 assembly (T2T-CHM13), provided a model of their homology 3, it remained unclear whether these patterns were ancestral or maintained by ongoing recombination exchange. Here we show that acrocentric chromosomes contain...

10.1038/s41586-023-05976-y article EN cc-by Nature 2023-05-10

Abstract The Human Pangenome Reference Consortium (HPRC) presents a first draft human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence and are accurate at the structural and base-pair levels. Based on alignments of the assemblies, we generated a draft pangenome that captures known variants and haplotypes, reveals novel alleles at structurally complex loci, and adds 119 million base pairs of euchromatic polymorphic sequence and 1,529 gene...

10.1101/2022.07.09.499321 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-07-09
Mitchell R. Vollger Philip C. Dishuck William T. Harvey William S. DeWitt Xavi Guitart and 95 more Michael E. Goldberg Allison N. Rozanski Julian Lucas Mobin Asri Haley Abel Lucinda Antonacci-Fulton Gunjan Baid Carl Baker Anastasiya Belyaeva Konstantinos Billis Guillaume Bourque Silvia Buonaiuto Andrew Carroll Mark Chaisson Pi-Chuan Chang Xian Chang Haoyu Cheng Justin Chu Sarah Cody Vincenza Colonna Daniel E. Cook Robert Cook‐Deegan Omar E. Cornejo Mark Diekhans Daniel Doerr Peter Ebert Jana Ebler Jordan M. Eizenga Susan Fairley Olivier Fédrigo Adam L. Felsenfeld Xiaowen Feng Christian Fischer Paul Flicek Giulio Formenti Adam Frankish Robert S. Fulton Yan Gao Shilpa Garg Erik Garrison Nanibaa’ A. Garrison Carlos García Girón Richard E. Green Cristian Groza Andrea Guarracino Leanne Haggerty Ira M. Hall Marina Haukness David Haussler Simon Heumos Glenn Hickey Thibaut Hourlier Kerstin Howe Miten Jain Erich D. Jarvis Hanlee P. Ji Eimear E. Kenny Barbara A. Koenig Alexey Kolesnikov Jan O. Korbel Jennifer Kordosky Sergey Koren HoJoon Lee Heng Li Wen‐Wei Liao Shuangjia Lu Tsung-Yu Lu Julian Lucas Hugo Magalhães Santiago Marco‐Sola Pierre Marijon Charles Markello Tobias Marschall Fergal J. Martin Ann M. Mc Cartney Jennifer McDaniel Karen H. Miga Matthew W. Mitchell Jean Monlong Jacquelyn Mountcastle Moses Njagi Mwaniki Maria Nattestad Adam M. Novak Sergey Nurk Hugh E. Olsen Nathan D. Olson Benedict Paten Trevor Pesout Adam M. Phillippy Alice B. Popejoy Pjotr Prins Daniela Puiu Mikko Rautiainen Allison Regier Arang Rhie

Abstract Single-nucleotide variants (SNVs) in segmental duplications (SDs) have not been systematically assessed because of the limitations of mapping short-read sequencing data 1,2. Here we constructed 1:1 unambiguous alignments spanning high-identity SDs across 102 human haplotypes and compared the pattern of SNVs between unique and duplicated regions 3,4. We find that SNVs are elevated 60% in SDs compared to unique regions and estimate that at least 23% of this increase is due to interlocus gene conversion (IGC) with up to 4.3 megabase pairs of SD sequence...

10.1038/s41586-023-05895-y article EN cc-by Nature 2023-05-10

We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent...

10.1016/j.cell.2024.01.052 article EN cc-by Cell 2024-02-29

Abstract The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation 1,2. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great...

10.1038/s41586-021-03519-x article EN cc-by Nature 2021-05-05

Current genotyping approaches for single-nucleotide variations rely on short, accurate reads from second-generation sequencing devices. Presently, third-generation sequencing platforms are rapidly becoming more widespread, yet approaches for leveraging their long but error-prone reads for genotyping are lacking. Here, we introduce a novel statistical framework for the joint inference of haplotypes and genotypes from noisy long reads, which we term diplotyping. Our technique takes full advantage of linkage information provided by long reads. We validate hundreds...

10.1186/s13059-019-1709-0 article EN cc-by Genome biology 2019-06-03

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure including long palindromes, tandem repeats, and segmental duplications 1–3. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished 4, 5. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029 base pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, revealing the complete ampliconic...

10.1101/2022.12.01.518724 preprint EN bioRxiv (Cold Spring Harbor Laboratory) 2022-12-01

Brief summary: The Human Pangenome Reference Consortium reports a first draft of the human pangenome reference, due to replace the existing GRCh38 reference (1, 2). It is an updated, high-quality, graph-based, telomere-to-telomere representation of global genomic diversity, including common variants (single-nucleotide variants, structural variants, and functional elements).

10.1530/ey.20.12.1 article EN Yearbook of pediatric endocrinology 2023-09-08

Abstract Present workflows for producing human genome assemblies from long-read technologies have cost and production time bottlenecks that prohibit efficient scaling to large cohorts. We demonstrate an optimized PromethION nanopore sequencing method for eleven human genomes. The sequencing, performed on one machine in nine days, achieved an average 63x coverage, 42 Kb read N50, 90% median read identity and 6.5x coverage in 100 Kb+ reads using just three flow cells per sample. To assemble these data we introduce new...

10.1101/715722 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2019-07-26

Abstract The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has greatly benefited society 1, 2. However, it still has many gaps and errors, and does not represent a biological genome since it is a blend of multiple individuals 3, 4. Recently, a high-quality telomere-to-telomere reference genome, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a duplicate genome, and is thus nearly homozygous 5. To address these limitations, the Human...

10.1101/2022.03.06.483034 preprint EN cc-by bioRxiv (Cold Spring Harbor Laboratory) 2022-03-06

The prevailing genome assembly paradigm is to produce consensus sequences that “collapse” parental haplotypes into a consensus sequence. Here, we leverage the chromosome-wide phasing and scaffolding capabilities of single-cell strand sequencing (Strand-seq) 1,2 and combine them with high-fidelity (HiFi) long sequencing reads 3, in a novel reference-free workflow for diploid de novo genome assembly. Employing this strategy, we produce completely phased de novo genome assemblies separately for each haplotype of a single individual of Puerto Rican origin (HG00733)...

10.1101/855049 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2019-11-26

Abstract To better understand the pattern of primate genome structural variation, we sequenced and assembled using multiple long-read sequencing technologies the genomes of eight nonhuman primate species, including New World monkeys (owl monkey and marmoset), Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee). Compared to the human genome, we identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most...

10.1101/2023.03.07.531415 preprint EN cc-by-nc-nd bioRxiv (Cold Spring Harbor Laboratory) 2023-03-07
HoJoon Lee Stephanie Greer Dmitri S. Pavlichin Bo Zhou Alexander E. Urban and 95 more Tsachy Weissman Hanlee P. Ji Wen‐Wei Liao Mobin Asri Jana Ebler Daniel Doerr Marina Haukness Glenn Hickey Shuangjia Lu Julian Lucas Jean Monlong Haley Abel Silvia Buonaiuto Xian Chang Haoyu Cheng Justin Chu Vincenza Colonna Jordan M. Eizenga Xiaowen Feng Christian Fischer Robert S. Fulton Shilpa Garg Cristian Groza Andrea Guarracino William T. Harvey Simon Heumos Kerstin Howe Miten Jain Tsung-Yu Lu Charles Markello Fergal J. Martin Matthew W. Mitchell Katherine M. Munson Moses Njagi Mwaniki Adam M. Novak Hugh E. Olsen Trevor Pesout David Porubský Pjotr Prins Jonas A. Sibbesen Chad Tomlinson Flavia Villani Mitchell R. Vollger Lucinda Antonacci-Fulton Gunjan Baid Carl Baker Anastasiya Belyaeva Konstantinos Billis Andrew Carroll Pi-Chuan Chang Sarah Cody Daniel E. Cook Omar E. Cornejo Mark Diekhans Peter Ebert Susan Fairley Olivier Fédrigo Adam L. Felsenfeld Giulio Formenti Adam Frankish Yan Gao Carlos García Girón Richard E. Green Leanne Haggerty Kendra Hoekzema Thibaut Hourlier Hanlee P. Ji Alexey Kolesnikov Jan O. Korbel Jennifer Kordosky HoJoon Lee Alexandra P. Lewis Hugo Magalhães Santiago Marco‐Sola Pierre Marijon Jennifer McDaniel Jacquelyn Mountcastle Maria Nattestad Nathan D. Olson Daniela Puiu Allison Regier Arang Rhie Samuel Sacco Ashley D. Sanders Valérie Schneider Baergen I. Schultz Kishwar Shafin Jouni Sirén Michael W. Smith Heidi J. Sofia Ahmad Abou Tayoun Françoise Thibaud‐Nissen Francesca Floriana Tricomi Justin Wagner Jonathan Wood

The human pangenome, a new reference sequence, addresses many limitations of the current GRCh38 reference. The first release is based on 94 high-quality haploid assemblies from individuals with diverse backgrounds. We employed a k-mer indexing strategy for comparative analysis across multiple assemblies, including the pangenome reference, GRCh38, and CHM13, a telomere-to-telomere reference assembly. Our k-mer indexing approach enabled us to identify a valuable collection of universally conserved sequences across all assemblies, referred to as...

10.1016/j.crmeth.2023.100543 article EN cc-by-nc-nd Cell Reports Methods 2023-08-01

Abstract Existing human genome assemblies have almost entirely excluded highly repetitive sequences within and near centromeres, limiting our understanding of their sequence, evolution, and essential role in chromosome segregation. Here, we present an extensive study of newly assembled peri/centromeric sequences representing 6.2% (189.9 Mb) of the first complete, telomere-to-telomere human genome assembly (T2T-CHM13). We discovered novel patterns of peri/centromeric repeat organization, variation, and evolution at both large and small length scales....

10.1101/2021.07.12.452052 preprint EN public-domain bioRxiv (Cold Spring Harbor Laboratory) 2021-07-13

Abstract The Javan gibbon, Hylobates moloch, is an endangered gibbon species restricted to the forest remnants of western and central Java, Indonesia, and one of the rarest of the Hylobatidae family. Hylobatids consist of 4 genera (Hoolock, Hylobates, Symphalangus, and Nomascus) that are characterized by different numbers of chromosomes, ranging from 38 to 52. The underlying cause of this karyotype plasticity is not entirely understood, at least in part, due to the limited availability of genomic data. Here we present the first scaffold-level...

10.1093/jhered/esac043 article EN cc-by-nc Journal of Heredity 2022-09-23