- Genomics and Phylogenetic Studies
- Bacterial Genetics and Biotechnology
- RNA and protein synthesis mechanisms
- Machine Learning in Bioinformatics
- Microbial Metabolic Engineering and Bioproduction
- Evolution and Genetic Dynamics
- Streptococcal Infections and Treatments
- Bacterial biofilms and quorum sensing
- Antimicrobial Resistance in Staphylococcus
- Plant Pathogenic Bacteria Studies
- Antibiotic Resistance in Bacteria
- Mitochondrial Function and Pathology
- Tuberculosis Research and Epidemiology
- Legume Nitrogen Fixing Symbiosis
- Plant-Microbe Interactions and Immunity
- Photosynthetic Processes and Mechanisms
- Plant Stress Responses and Tolerance
- Salmonella and Campylobacter epidemiology
- Bacteriophages and microbial interactions
- Toxin Mechanisms and Immunotoxins
- Paraquat toxicity studies and treatments
- Gene expression and cancer classification
- Gene Regulatory Network Analysis
- Pluripotent Stem Cells Research
- Blind Source Separation Techniques
University of California, San Diego
2019-2024
La Jolla Bioengineering Institute
2020-2024
Jacobs (United States)
2019
University of Michigan
2018
Independent component analysis (ICA) of bacterial transcriptomes has emerged as a powerful tool for obtaining co-regulated, independently-modulated gene sets (iModulons), inferring their activities across a range of conditions, and enabling their association to known genetic regulators. By grouping and analyzing genes based on observations from big data alone, iModulons can provide a novel perspective into how the composition of the transcriptome adapts to environmental conditions. Here, we present iModulonDB...
The transcriptional regulatory network (TRN) of Bacillus subtilis coordinates cellular functions of fundamental interest, including metabolism, biofilm formation, and sporulation. Here, we use unsupervised machine learning to modularize the transcriptome and quantitatively describe activity under diverse conditions, creating an unbiased summary of gene expression. We obtain 83 independently modulated sets that explain most of the variance in expression and demonstrate that 76% of them represent the effects of known...
Bacterial gene expression is orchestrated by numerous transcription factors (TFs). Elucidating how gene expression is regulated is fundamental to understanding bacterial physiology and engineering it for practical use. In this study, a machine-learning approach was applied to uncover the genome-scale transcriptional regulatory network (TRN) in Pseudomonas putida KT2440, an important organism for bioproduction. We performed independent component analysis of a compendium of 321 high-quality profiles, which were previously...
Transcriptomic data is accumulating rapidly; thus, scalable methods for extracting knowledge from this data are critical. Here, we assembled a top-down expression and regulation knowledge base for Escherichia coli. The expression component is a 1035-sample, high-quality RNA-seq compendium consisting of data generated in our lab using a single experimental protocol. The compendium contains diverse growth conditions, including: 9 media; 39 supplements, including antibiotics; 42 heterologous proteins; and 76 gene knockouts. Using this resource, we elucidated...
Mycobacterium tuberculosis is one of the most consequential human bacterial pathogens, posing a serious challenge to 21st century medicine. A key feature of its pathogenicity is its ability to adapt its transcriptional response to environmental stresses through its transcriptional regulatory network (TRN). While many studies have sought to characterize specific portions of the M. tuberculosis TRN, and some have performed system-level analysis, few have been able to provide a network-based model of the TRN that also provides the relative shifts in regulator activity triggered by...
We are firmly in the era of biological big data. Millions of omics datasets are publicly accessible and can be employed to support scientific research or build a holistic view of an organism. Here, we introduce a workflow that converts all public gene expression data for a microbe into a dynamic representation of the organism’s transcriptional regulatory network. This five-step process walks researchers through the mining, processing, curation, analysis, and characterization of available data, using Bacillus...
The transcriptional regulatory network (TRN) of Pseudomonas aeruginosa coordinates cellular processes in response to stimuli. We used 364 transcriptomes (281 publicly available + 83 in-house generated) to reconstruct the TRN of P. aeruginosa using independent component analysis. We identified 104 independently modulated sets of genes (iModulons), among which 81 reflect the effects of known regulators. We identified iModulons that (i) play an important role in defining the genomic boundaries of biosynthetic gene clusters (BGCs), (ii)...
It has proved challenging to quantitatively relate the proteome to the transcriptome on a per-gene basis. Recent advances in data analytics have enabled a biologically meaningful modularization of the bacterial transcriptome. We thus investigate whether matched datasets of transcriptomes and proteomes from bacteria under diverse conditions can be modularized in the same way to reveal novel relationships between their compositions. We find that: (1) the modules are comprised of a similar list of gene products, (2) often...
Pseudomonas aeruginosa is an opportunistic pathogen and a major cause of hospital-acquired infections. The virulence of P. aeruginosa is largely determined by its transcriptional regulatory network (TRN). We used 411 transcription profiles from diverse growth conditions to construct a quantitative TRN by identifying independently modulated sets of genes (called iModulons) and their condition-specific activity levels. The current study focused on the use of iModulons to analyze biofilm production and antibiotic resistance...
Vibrio natriegens regulates natural competence through the TfoX and QstR transcription factors, which are involved in external DNA capture and transport. However, the extensive genetic and transcriptional regulatory basis for competency remains unknown. We used a machine-learning approach to decompose V. natriegens's transcriptome into 45 groups of independently modulated sets of genes (iModulons). Our findings show that competency is associated with the repression of two housekeeping iModulons (iron metabolism and translation)...
Relationships between the genome, transcriptome, and metabolome underlie all evolved phenotypes. However, it has proved difficult to elucidate these relationships because of the high number of variables measured. A recently developed data analytic method for characterizing the transcriptome can simplify interpretation by grouping genes into independently modulated sets (iModulons). Here, we demonstrate how iModulons reveal a deep understanding of the effects of causal mutations and metabolic rewiring. We use adaptive...
Fast growth phenotypes are achieved through optimal transcriptomic allocation, in which cells must balance tradeoffs in resource allocation between diverse functions. One such tradeoff, between stress readiness and unbridled growth in E. coli, has been termed the fear versus greed (f/g) tradeoff. Two specific RNA polymerase (RNAP) mutations observed in adaptation to fast growth have previously been shown to affect the f/g tradeoff, suggesting that genetic adaptations may be primed to control resource allocation. Here, we conduct a greatly expanded...
Public gene expression databases are a rapidly expanding resource of organism responses to diverse perturbations, presenting both an opportunity and a challenge for bioinformatics workflows to extract actionable knowledge of transcription regulatory network function. Here, we introduce a five-step computational pipeline, called iModulonMiner, to compile, process, curate, analyze, and characterize the totality of RNA-seq data for a given organism or cell type. This workflow is centered around the data-driven computation...
Dynamic cellular responses to environmental constraints are coordinated by the transcriptional regulatory network (TRN), which modulates gene expression. This network controls the most fundamental cellular responses, including metabolism, motility, and stress responses. Here, we apply independent component analysis, an unsupervised machine learning approach, to 95 high-quality Sulfolobus acidocaldarius RNA-seq datasets and extract 45 independently modulated sets, or iModulons. Together, these iModulons contain 755...
Establishing transcriptional regulatory networks (TRNs) in bacteria has been limited to well-characterized model strains. Using machine learning methods, we established the TRNs of six Salmonella enterica serovar Typhimurium strains from their transcriptomes. By decomposing a compendium of RNA sequencing (RNA-seq) data with independent component analysis, we obtained 400 independently modulated sets of genes, called iModulons. We (i) performed pan-genome analysis of the phylogroup structure of S. enterica serovar Typhimurium and analyzed...
can cause a wide variety of acute infections throughout the body of its human host. An underlying transcriptional regulatory network (TRN) is responsible for altering the physiological state of the bacterium to adapt to each unique host environment. Consequently, an in-depth understanding of the comprehensive dynamics
Transcriptomic data is accumulating rapidly; thus, development of scalable methods for extracting knowledge from this data is critical. We assembled a top-down transcriptional regulatory network for Escherichia coli from a 1035-sample, single-protocol, high-quality RNA-seq compendium. The compendium contains diverse growth conditions, including: 4 temperatures; 9 media; 39 supplements, including antibiotics; and 76 unique gene knockouts. Using unsupervised machine learning, we extracted 117 modules...
The rise in hospital outbreaks of multidrug-resistant Acinetobacter baumannii infections underscores the urgent need for alternatives to traditional broad-spectrum antibiotic therapies. The success of A. baumannii as a significant nosocomial pathogen is largely attributed to its ability to resist antibiotics and survive environmental stressors. However, there is limited literature available on the global, complex regulatory circuitry that shapes these phenotypes. Computational tools can assist in the elucidation of A. baumannii’s...
Type I diabetes mellitus, which affects an estimated 1.5 million Americans, is caused by autoimmune destruction of the pancreatic β-cells that results in the need for life-long insulin therapy. Allogeneic islet transplantation for the treatment of type I diabetes, a therapy in which donor islets are infused intrahepatically, has led to the transient reversal of diabetes. However, the therapeutic limitations of allogeneic transplantation, which include a shortage of donor islets, long-term immunosuppression, and high risk of tissue rejection, have led to the investigation...
Fit phenotypes are achieved through optimal transcriptomic allocation. Here, we performed a high-resolution, multi-scale study of the tradeoff between two key fitness phenotypes, stress response (fear) and growth (greed), in
There is growing interest in engineering
is responsible for a range of diseases in humans contributing significantly to morbidity and mortality. Among more than 200 serotypes
iModulons, sets of co-expressed genes identified through independent component analysis (ICA) of high-quality transcriptomic datasets, provide an unbiased, modular view of an organism's transcriptional regulatory network. Established in 2020, iModulonDB (iModulonDB.org) serves as a centralized repository of curated iModulon sets, enabling users to explore iModulons and download the associated data. This update reflects a significant expansion of the database: 19 new ICA decompositions (+633%) spanning 8 925...
Bacillus subtilis is a well-characterized microorganism and a model for the study of Gram-positive bacteria. The bacterium can produce proteins at high densities and yields, which has made it valuable for industrial bioproduction. Like other cell factories, metabolic modeling of B. subtilis has discovered ways to optimize its metabolism toward various applications. The first genome-scale metabolic model (M-model) of B. subtilis was published more than a decade ago and has been applied extensively to understand metabolism, predict growth phenotypes, and served...