Adam M. Phillippy

ORCID: 0000-0003-2983-8934
Research Areas
  • Genomics and Phylogenetic Studies
  • Chromosomal and Genetic Variations
  • RNA and protein synthesis mechanisms
  • Bacteriophages and microbial interactions
  • CRISPR and Genetic Engineering
  • Plant Virus Research Studies
  • Genomic variations and chromosomal abnormalities
  • Genomics and Chromatin Dynamics
  • Genetic diversity and population structure
  • Genetics, Bioinformatics, and Biomedical Research
  • Mosquito-borne diseases and control
  • Genomics and Rare Diseases
  • Vibrio bacteria research studies
  • Genetic Mapping and Diversity in Plants and Animals
  • Antibiotic Resistance in Bacteria
  • Microbial infections and disease research
  • Algorithms and Data Compression
  • Insect symbiosis and bacterial influences
  • Genetic and Clinical Aspects of Sex Determination and Chromosomal Abnormalities
  • Gene expression and cancer classification
  • Microbial Community Ecology and Physiology
  • Vaccines and immunoinformatics approaches
  • RNA modifications and cancer
  • Genetic and phenotypic traits in livestock
  • Bioinformatics and Genomic Networks

National Institutes of Health
2016-2025

National Human Genome Research Institute
2016-2025

ORCID
2021

University of Maryland, College Park
2005-2014

Battelle
2011

University of Maryland, Baltimore
2005-2011

Technische Universität Berlin
2005

Johns Hopkins University
2005

Biotechnology Institute
2005

George Washington University
2005

Long-read single-molecule sequencing has revolutionized de novo genome assembly and enabled the automated reconstruction of reference-quality genomes. However, given the relatively high error rates of such technologies, efficient and accurate assembly of large repeats and closely related haplotypes remains challenging. We address these issues with Canu, a successor of Celera Assembler that is specifically designed for noisy single-molecule sequences. Canu introduces support for nanopore sequencing, halves depth-of-coverage requirements,...

10.1101/gr.215087.116 article EN cc-by-nc Genome Research 2017-03-15

The newest version of MUMmer easily handles comparisons of large eukaryotic genomes at varying evolutionary distances, as demonstrated by applications to multiple genomes. Two new graphical viewing tools provide alternative ways to analyze genome alignments. The new system is the first version of MUMmer to be released as open-source software. This allows other developers to contribute to the code base and freely redistribute the code. The MUMmer sources are available at http://www.tigr.org/software/mummer.

10.1186/gb-2004-5-2-r12 article EN cc-by Genome Biology 2004-01-30

A fundamental question in microbiology is whether there is a continuum of genetic diversity among genomes, or clear species boundaries prevail instead. Whole-genome similarity metrics such as Average Nucleotide Identity (ANI) help address this question by facilitating high-resolution taxonomic analysis of thousands of genomes from diverse phylogenetic lineages. To scale to available genomes and beyond, we present FastANI, a new method to estimate ANI using alignment-free approximate sequence mapping. FastANI is accurate for...

10.1038/s41467-018-07641-9 article EN cc-by Nature Communications 2018-11-26

Mash extends the MinHash dimensionality-reduction technique to include a pairwise mutation distance and P value significance test, enabling the efficient clustering and search of massive sequence collections. Mash reduces large sequences and sequence sets to small, representative sketches, from which global mutation distances can be rapidly estimated. We demonstrate several use cases, including the clustering of all 54,118 NCBI RefSeq genomes in 33 CPU h; real-time database search using assembled or unassembled Illumina, Pacific Biosciences, and Oxford...

10.1186/s13059-016-0997-x article EN cc-by Genome Biology 2016-06-20
Sergey Nurk, Sergey Koren, Arang Rhie, Mikko Rautiainen, Andrey V. Bzikadze, Alla Mikheenko, Mitchell R. Vollger, Nicolas Altemose, Lev Uralsky, Ariel Gershman, Sergey Aganezov, Savannah J. Hoyt, Mark Diekhans, Glennis A. Logsdon, Michael Alonge, Stylianos E. Antonarakis, Matthew Borchers, Gerard G. Bouffard, Shelise Brooks, Gina V. Caldas, Nae-Chyun Chen, Haoyu Cheng, Chen-Shan Chin, William Chow, Leonardo Gomes de Lima, Philip C. Dishuck, Richard Durbin, Tatiana Dvorkina, Ian T. Fiddes, Giulio Formenti, Robert S. Fulton, Arkarachai Fungtammasan, Erik Garrison, Patrick G. S. Grady, Tina A. Graves-Lindsay, Ira M. Hall, Nancy F. Hansen, Gabrielle A. Hartley, Marina Haukness, Kerstin Howe, Michael W. Hunkapiller, Chirag Jain, Miten Jain, Erich D. Jarvis, Peter Kerpedjiev, Melanie Kirsche, Mikhail Kolmogorov, Jonas Korlach, Milinn Kremitzki, Heng Li, Valerie V. Maduro, Tobias Marschall, Ann M. Mc Cartney, Jennifer McDaniel, Danny E. Miller, James C. Mullikin, Eugene W. Myers, Nathan D. Olson, Benedict Paten, Paul Peluso, Pavel A. Pevzner, David Porubský, Tamara Potapova, Е. И. Рогаев, Jeffrey Rosenfeld, Steven L. Salzberg, Valérie Schneider, Fritz J. Sedlazeck, Kishwar Shafin, Colin J. Shew, Alaina Shumate, Ying Sims, Arian F. A. Smit, Daniela C. Soto, Ivan Sović, Jessica M. Storer, Aaron Streets, Beth A. Sullivan, Françoise Thibaud‐Nissen, James Torrance, Justin Wagner, Brian P. Walenz, Aaron M. Wenger, Jonathan Wood, Chunlin Xiao, Stephanie M. Yan, Alice Young, Samantha Zarate, Urvashi Surti, Rajiv C. McCoy, Megan Y. Dennis, Ivan A. Alexandrov, Jennifer L. Gerton, Rachel J. O’Neill, Winston Timp, Justin M. Zook, Michael C. Schatz, Evan E. Eichler, Karen H. Miga, Adam M. Phillippy

Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion–base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be...

10.1126/science.abj6987 article EN Science 2022-03-31

Recent long-read assemblies often exceed the quality and completeness of available reference genomes, making validation challenging. Here we present Merqury, a novel tool for reference-free assembly evaluation based on efficient k-mer set operations. By comparing k-mers in a de novo assembly to those found in unassembled high-accuracy reads, Merqury estimates base-level accuracy and completeness. For trios, Merqury can also evaluate haplotype-specific accuracy, completeness, phase block continuity, and switch...

10.1186/s13059-020-02134-9 article EN cc-by Genome Biology 2020-09-14

A critical output of metagenomic studies is the estimation of abundances of taxonomical or functional groups. The inherent uncertainty in assignments to these groups makes it important to consider both their hierarchical contexts and their prediction confidence. The current tools for visualizing metagenomic data, however, omit or distort quantitative hierarchical relationships and lack the facility for displaying secondary variables. Here we present Krona, a new visualization tool that allows intuitive exploration of relative abundances and confidences within the complex...

10.1186/1471-2105-12-385 article EN cc-by BMC Bioinformatics 2011-09-30

The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very large sequence data sets that are common today. In this paper we describe MUMmer4, a substantially...

10.1371/journal.pcbi.1005944 article EN public-domain PLoS Computational Biology 2018-01-26

A human genome is sequenced and assembled de novo using a pocket-sized nanopore device. We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb,...

10.1038/nbt.4060 article EN cc-by Nature Biotechnology 2018-01-29

Whole-genome sequences are now available for many microbial species and clades; however, existing whole-genome alignment methods are limited in their ability to perform sequence comparisons of multiple sequences simultaneously. Here we present the Harvest suite of core-genome alignment and visualization tools for the rapid and simultaneous analysis of thousands of intraspecific microbial strains. Harvest includes Parsnp, a fast core-genome multi-aligner, and Gingr, a dynamic visual platform. Together they provide interactive core-genome alignments, variant calls, recombination...

10.1186/s13059-014-0524-x article EN cc-by Genome Biology 2014-11-19

We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more...

10.1093/nar/30.11.2478 article EN Nucleic Acids Research 2002-06-01

The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype...

10.1101/gr.213611.116 article EN cc-by-nc Genome Research 2017-04-10

New sequencing technology has dramatically altered the landscape of whole-genome sequencing, allowing scientists to initiate numerous projects to decode the genomes of previously unsequenced organisms. The lowest-cost technology can generate deep coverage of most species, including mammals, in just a few days. The sequence data generated by one of these projects consist of millions or billions of short DNA sequences (reads) that range from 50 to 150 nt in length. These sequences must then be assembled de novo before most genome analyses can begin....

10.1101/gr.131383.111 article EN Genome Research 2011-12-06

The process of generating raw genome sequence data continues to become cheaper, faster, and more accurate. However, assembly of such data into high-quality, finished genome sequences remains challenging. Many genome assembly tools are available, but they differ greatly in terms of their performance (speed, scalability, hardware requirements, acceptance of newer read technologies) and in their final output (composition of assembled sequence). More importantly, it remains largely unclear how to best assess the quality of assembled genome sequences. The Assemblathon competitions...

10.1186/2047-217x-2-10 article EN GigaScience 2013-07-22

After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist. Here we present a human genome assembly that surpasses the continuity of GRCh38, along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for...

10.1038/s41586-020-2547-7 article EN cc-by Nature 2020-07-14

Long-read sequencing and novel long-range assays have revolutionized de novo genome assembly by automating the reconstruction of reference-quality genomes. In particular, Hi-C sequencing is becoming an economical method for generating chromosome-scale scaffolds. Despite its increasing popularity, there are limited open-source tools available. Errors, particularly inversions and fusions across chromosomes, remain higher than with alternate scaffolding technologies. We present a scaffolder that does not require...

10.1371/journal.pcbi.1007273 article EN public-domain PLoS Computational Biology 2019-08-21

Complete and accurate genome assemblies form the basis of most downstream genomic analyses and are of critical importance. Recent genome assembly projects have relied on a combination of noisy long-read sequencing and accurate short-read sequencing, with the former offering greater assembly continuity and the latter providing higher consensus accuracy. The recently introduced Pacific Biosciences (PacBio) HiFi sequencing technology bridges this divide by delivering long reads (>10 kbp) with high per-base accuracy (>99.9%). Here we present HiCanu,...

10.1101/gr.263566.120 article EN cc-by-nc Genome Research 2020-08-14

The MUMmer sequence alignment package is a suite of computer programs designed to detect regions of homology in long biological sequences. Version 2.1 makes several improvements to the package, including: increased speed and reduced memory requirements; the ability to handle both protein and DNA sequences; the ability to handle multiple sequence fragments; and new algorithms for clustering together basic matches. The system is particularly efficient at comparing highly similar sequences, such as alternative versions of fragment assemblies or...

10.1002/0471250953.bi1003s00 article EN Current Protocols in Bioinformatics 2003-01-01

Major advances in selection progress for cattle have been made following the introduction of genomic tools over the past 10-12 years. These tools depend upon the Bos taurus reference genome (UMD3.1.1), which was created using now-outdated technologies and is hindered by a variety of deficiencies and inaccuracies. We present the new reference genome for cattle, ARS-UCD1.2, based on the same animal as the original to facilitate transfer and interpretation of results obtained from the earlier version, but applying a combination of modern technologies in a de novo assembly to increase...

10.1093/gigascience/giaa021 article EN cc-by GigaScience 2020-03-01

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware...

10.1101/gr.126599.111 article EN cc-by-nc Genome Research 2011-09-16

Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored the physical and cytogenetic maps, doubled the number...

10.1038/s41586-018-0692-z article EN cc-by Nature 2018-11-14