- Microbial Community Ecology and Physiology
- Bacteriophages and microbial interactions
- Genomics and Phylogenetic Studies
- Marine and coastal ecosystems
- Protist diversity and phylogeny
- Plant Virus Research Studies
- Animal Virus Infections Studies
- RNA and protein synthesis mechanisms
- Marine Biology and Ecology Research
- COVID-19 and healthcare impacts
- Health disparities and outcomes
- Viral gastroenteritis research and epidemiology
- RNA modifications and cancer
- Face and Expression Recognition
- Molecular Biology Techniques and Applications
- Influenza Virus Research Studies
- Geology and Paleoclimatology Research
- Viral Infectious Diseases and Gene Expression in Insects
- Plant Pathogens and Fungal Diseases
- Microbial Fuel Cells and Bioremediation
- Biosimilars and Bioanalytical Methods
- Geochemistry and Elemental Analysis
- Cystic Fibrosis Research Advances
- Dark Matter and Cosmic Phenomena
- Bacterial Genetics and Biotechnology
Clark University
2015-2024
University of Southern California
2015-2019
University of Washington
2003-2015
Massachusetts Institute of Technology
1999-2005
Identifying viral sequences in mixed metagenomes containing both viral and host contigs is a critical first step in analyzing the viral component of samples. Current tools for distinguishing prokaryotic virus and host contigs primarily use gene-based similarity approaches. Such approaches can significantly limit results, especially for short contigs that have few predicted proteins or lack proteins with similarity to previously known viruses. We developed VirFinder, a k-mer frequency based, machine learning method for virus contig identification that entirely avoids...
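As a rough illustration of the k-mer frequency features this abstract refers to, the sketch below counts overlapping k-mers in a contig and normalizes them to frequencies. The function name `kmer_frequencies`, the default `k = 4`, and the choice to skip k-mers containing ambiguous bases are illustrative assumptions, not VirFinder's actual implementation.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 4) -> dict[str, float]:
    """Return normalized frequencies of all 4**k DNA k-mers in seq.

    Hypothetical helper for illustration; k-mers containing characters
    other than A, C, G, T are skipped (an assumption, not a documented rule).
    """
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")
    )
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    # Counter returns 0 for k-mers that never occur in the sequence.
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}
```

A vector like this could then be passed to any standard classifier; the abstract does not say which model is used beyond "machine learning".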
Background The recent development of metagenomic sequencing makes it possible to massively sequence microbial genomes, including viral genomes, without the need for laboratory culture. Existing reference‐based and gene homology‐based methods are not efficient in identifying unknown viruses or short viral sequences from metagenomic data. Methods Here we developed a reference‐free and alignment‐free machine learning method, DeepVirFinder, for identifying viral sequences in metagenomic data using deep learning. Results Trained based on sequences from viral RefSeq discovered before May 2015,...
Abstract Viruses and their host genomes often share similar oligonucleotide frequency (ONF) patterns, which can be used to predict the host of a given virus by finding the host with the greatest ONF similarity. We comprehensively compared 11 ONF metrics using several k-mer lengths for predicting host taxonomy from among ∼32 000 prokaryotic genomes for 1427 virus isolate genomes whose true hosts are known. The background-subtracting measure $d_2^*$ at k = 6 gave the highest host prediction accuracy (33%, genus level) with reasonable computational times....
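The host-prediction idea described here (pick the candidate host whose ONF pattern is closest to the virus) can be sketched as follows. This is a minimal sketch under stated assumptions: it uses plain Manhattan distance between k-mer frequency vectors as a simple stand-in, not the background-subtracting $d_2^*$ measure, and the names `onf_vector`, `manhattan`, and `predict_host` are hypothetical.

```python
from collections import Counter
from itertools import product


def onf_vector(seq, k=6):
    """Oligonucleotide frequency vector over all 4**k A/C/G/T k-mers."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in kmers)
    return [counts[w] / total if total else 0.0 for w in kmers]


def manhattan(u, v):
    """Manhattan (L1) distance; a stand-in here, NOT the d2* measure."""
    return sum(abs(a - b) for a, b in zip(u, v))


def predict_host(virus_seq, host_seqs, k=6):
    """Return the name of the host genome with the smallest ONF distance.

    host_seqs is an assumed mapping of host name -> genome sequence.
    """
    v = onf_vector(virus_seq, k)
    return min(host_seqs, key=lambda name: manhattan(v, onf_vector(host_seqs[name], k)))
```

For example, `predict_host(virus, {"hostA": genome_a, "hostB": genome_b})` returns whichever name has the more similar k-mer profile. Recomputing host vectors on every call is wasteful for tens of thousands of genomes; caching them would be the obvious refinement.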
Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities.
The closely related cyanobacteria Synechococcus and Prochlorococcus have different distributions in stratified water columns of the northern Sargasso Sea. The abundance of Synechococcus is relatively uniform with depth, but Prochlorococcus cell numbers are low within shallow mixed layers and high below the thermocline. Because free cupric ion (free Cu²⁺) concentrations (up to 6 pM) are lower in deeper water, there is an inverse relationship between Prochlorococcus cell densities and free Cu²⁺ concentration. We explored the possibility of a causal underpinning for this...
Marine Synechococcus is a globally significant genus of cyanobacteria that is comprised of multiple genetic lineages or clades. These clades are thought to represent ecologically distinct units, or ecotypes. Because multiple clades often co-occur in the oceans, Synechococcus are ideal microbes to explore how closely related bacterial taxa within the same functional guild of organisms coexist and partition marine habitats. Here we perform multi-locus sequencing of cultured strains to confirm the congruency of clade classifications between the 16S-23S...
Metagenomic sequencing has greatly enhanced the discovery of viral genomic sequences; however, it remains challenging to identify the host(s) of these new viruses. We developed VirHostMatcher-Net, a flexible, network-based, Markov random field framework for predicting virus-prokaryote interactions using multiple, integrated features: CRISPR sequences and alignment-free similarity measures ([Formula: see text] and WIsH). Evaluation of this method on a benchmark set of 1462 known virus-prokaryote pairs yielded host prediction...
Marine microbial communities often contain multiple closely related phylogenetic clades, but in many cases, it is still unclear what physiological traits differentiate these putative ecotypes. The numerically abundant marine cyanobacterium Synechococcus can be divided into at least 14 clades. In order to better understand ecotype differentiation in this genus, we assessed the diversity of a Synechococcus community from a well-mixed water column in the Sargasso Sea during March 2002, a time of year when this genus typically...
Summary Prochlorococcus is a marine cyanobacterium which is found at high abundances in the world's tropical and subtropical oligotrophic oceans. The genus can be divided into two major groups based on light physiology. Both of these groups can be further subdivided into genetically distinct lineages, or ecotypes. Real‐time polymerase chain reaction (PCR) assays based on sequence differences in the 16S‐23S rDNA internal transcribed spacer and the 23S rDNA were developed to examine the distribution of each ecotype in the field. The real‐time PCR assays enabled linear...
Many Proteobacteria possess LuxI-LuxR–type quorum-sensing systems that produce and detect fatty acyl-homoserine lactone (HSL) signals. The photoheterotroph Rhodopseudomonas palustris is unusual in that it produces and detects an aryl-HSL, p-coumaroyl-HSL, and signal production requires an exogenous source of p-coumarate. A photosynthetic stem-nodulating member of the genus Bradyrhizobium produces a small molecule that elicits an R. palustris response. Here, we show that this molecule is cinnamoyl-HSL, produced by the LuxI homolog BraI and detected by BraR....
Marine Thaumarchaeota are abundant ammonia-oxidizers but have few representative laboratory-cultured strains. We report the cultivation of Candidatus Nitrosomarinus catalina SPOT01, a novel strain that is less warm-temperature tolerant than other cultivated Thaumarchaeota. Based on metagenomic recruitment, SPOT01 comprises a major portion (4-54%) of the Thaumarchaeota in temperate Pacific waters. Its complete 1.36 Mbp genome possesses several distinguishing features: putative phosphorothioation (PT) DNA modification...
The Costa Rica Dome (CRD) is a wind‐driven upwelling feature in the eastern tropical Pacific that supports unusually high concentrations (> 10⁶ cells mL⁻¹) of the picocyanobacteria Prochlorococcus and Synechococcus. To understand what causes this unusual phytoplankton bloom, we conducted a comprehensive survey of hydrography, picophytoplankton population structure, and trace metal chemistry in the CRD and surrounding oligotrophic equatorial waters. Based on size‐fractionated chlorophyll, picoplankton...
Summary Currently defined ecotypes in the marine cyanobacteria Prochlorococcus and Synechococcus likely contain subpopulations that themselves are ecologically distinct. We developed and applied high‐throughput sequencing for the 16S‐23S rRNA internally transcribed spacer (ITS) to examine ecotype and fine‐scale genotypic community dynamics for monthly surface water samples spanning 5 years at the San Pedro Ocean Time‐series site. Ecotype‐level structure displayed regular seasonal patterns including succession,...
Metagenomics has transformed our understanding of microbial diversity across ecosystems, with recent advances enabling de novo assembly of genomes from metagenomes. These metagenome-assembled genomes are critical to provide ecological, evolutionary, and metabolic context for all the microbes and viruses yet to be cultivated. Metagenomes can now be generated from nanogram to subnanogram amounts of DNA. However, these libraries require several rounds of PCR amplification before sequencing, and data suggest they typically yield smaller...
The study of virus-host infectious association is important for understanding the functions and dynamics of microbial communities. Both cellular and fractionated viral metagenomic data generate a large number of viral contigs with missing host information. Although relatively simple methods based on the similarity between the word frequency vectors of viruses and bacterial hosts have been developed to study virus-host associations, the problem is significantly understudied. We hypothesize that machine learning methods based on word frequencies can be efficiently used...
Summary Phytoplankton are limited by iron (Fe) in ~40% of the world's oceans, including high‐nutrient low‐chlorophyll (HNLC) regions. While low‐Fe adaptation has been well‐studied in large eukaryotic diatoms, less is known for small, prokaryotic marine picocyanobacteria. This study reveals key physiological and genomic differences underlying Fe adaptation. HNLC ecotype CRD1 strains have greater tolerance to low Fe, congruent with their expanded repertoire of Fe transporter, storage and regulatory genes compared to other...
Cyanophages exert important top-down controls on their cyanobacteria hosts; however, concurrent analysis of both phage and host populations is needed to better assess phage–host interaction models. We analyzed the picocyanobacteria Prochlorococcus and Synechococcus and T4-like cyanophage communities in Pacific Ocean surface waters using five years of monthly viral and cellular fraction metagenomes. Cyanophage communities contained thousands of mostly low-abundance (<2% relative abundance) species with varying temporal...
Synechococcus, a genus of unicellular cyanobacteria found throughout the global surface ocean, is a large driver of Earth's carbon cycle. Developing a better understanding of its diversity and distributions is an ongoing effort in biological oceanography. Here, we introduce 12 new draft genomes of marine Synechococcus isolates spanning five clades and utilize ~100 environmental metagenomes largely sourced from the TARA Oceans project to assess the genomic lineages that they and other reference genomes represent. We show that newly...
In a world of increasingly urbanized environments, it is critical to understand the impact on microbial communities, including fungal communities, as a measure of ecosystem health, and to document how these environments are changing. Aquatic environments, in particular, can be highly sensitive to urbanization, with the removal of local habitat and inputs of wastewater and contaminants leading to displacement or extinction of natural flora, fauna, and funga. Microbial communities especially are extremely complex and far less characterized compared to their macro counterparts. Here, we...
The cyanobacterium Prochlorococcus is the dominant phototroph in surface waters of the vast oligotrophic oceans, the foundation of marine food webs, and an important component of global biogeochemical cycles. The prominence of Prochlorococcus across the environmental gradients of the open ocean is attributed to its extensive genetic diversity and flexible chlorophyll physiology, enabling light capture over a wide range of intensities. What remains unknown is the balance between the temporal dynamics of diversity and physiology in the ability of Prochlorococcus to respond to a variety of short (approximately one...
Abstract The cyanobacterium Prochlorococcus is the most abundant photosynthetic cell on Earth and contributes to global ocean carbon cycling and food webs. Prochlorococcus is known for its extensive diversity that falls into two groups of ecotypes, the low‐light (LL) and high‐light (HL) adapted ecotypes. Previous work has shown niche partitioning of the very abundant HL ecotypes and subecotypes across oceanographic gradients including temperature, nutrients, and day length. However, niche partitioning within the LL ecotypes has not been studied as well because they are less...