- Metabolomics and Mass Spectrometry Studies
- Analytical Chemistry and Chromatography
- Mass Spectrometry Techniques and Applications
- Computational Drug Discovery Methods
- Bioinformatics and Genomic Networks
- Isotope Analysis in Ecology
- Advanced Chemical Sensor Technologies
- Ion-surface interactions and analysis
- Optimization and Search Problems
- Complexity and Algorithms in Graphs
- Microbial Community Ecology and Physiology
- Genomics and Phylogenetic Studies
- Advanced Proteomics Techniques and Applications
- Advanced Graph Theory Research
- Integrated Circuits and Semiconductor Failure Analysis
- Machine Learning in Materials Science
- Electron and X-Ray Spectroscopy Techniques
- Species Distribution and Climate Change
- Microbial Natural Products and Biosynthesis
- Infrared Target Detection Methodologies
- Advanced Measurement and Detection Methods
- Legume Nitrogen Fixing Symbiosis
- Machine Learning in Bioinformatics
- RNA modifications and cancer
- DNA Repair Mechanisms
Friedrich Schiller University Jena, 2012-2025
Pennsylvania State University, 2016
Schiller International University, 2013
Humboldt-Universität zu Berlin, 1993
Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but typically only a small fraction of spectra can be matched. Previous in silico methods search structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow, which combines database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse...
Molecular networking has become a key method used to visualize and annotate the chemical space in non-targeted mass spectrometry-based experiments. However, distinguishing isomeric compounds and quantitative interpretation are currently limited. Therefore, we created Feature-based Molecular Networking (FBMN) as a new analysis method in the Global Natural Products Social Molecular Networking (GNPS) infrastructure. FBMN leverages feature detection and alignment tools to enhance quantitative analyses and isomer distinction, including from ion-mobility...
Small molecule machine learning aims to predict chemical, biochemical, or biological properties from molecular structures, with applications such as toxicity prediction, ligand binding, and pharmacokinetics. A recent trend is developing end-to-end models that avoid explicit domain knowledge. These models assume no coverage bias in the training and evaluation data, meaning the data are representative of the true distribution. However, applicability is rarely considered in these models. Here, we investigate how...
Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these are vastly incomplete; in silico methods search structure databases but cannot distinguish between correct and incorrect annotations. As biological interpretation relies on accurate annotations, the ability to assign confidence to such annotations is a key outstanding problem. We introduce the COSMIC workflow, which combines database generation, annotation, and a confidence score consisting of kernel density p-value estimation and a Support...
Metabolites, small molecules that are involved in cellular reactions, provide a direct functional signature of cellular state. Untargeted metabolomics experiments usually rely on tandem mass spectrometry to identify the thousands of compounds in a biological sample. Recently, we presented CSI:FingerID for searching molecular structure databases using tandem mass spectrometry data. CSI:FingerID predicts a molecular fingerprint that encodes the structure of the query compound, then uses this fingerprint to search a molecular structure database such as PubChem. Scoring of the predicted fingerprint and deterministic target fingerprints is...
The exchange of metabolites mediates algal and bacterial interactions that maintain ecosystem function. Yet, while thousands of metabolites are produced, only a few molecules have been identified in these associations. Using the ubiquitous microalgae Pseudo-nitzschia sp. as a model, we employed an untargeted metabolomics strategy to assign structural characteristics to the metabolites that distinguished specific diatom-microbiome associations. We cultured five species of Pseudo-nitzschia, including two that produced the toxin domoic acid, and examined their...
Metabolites provide a direct functional signature of cellular state. Untargeted metabolomics usually relies on mass spectrometry, a technology capable of detecting thousands of compounds in a biological sample. Metabolite annotation is executed using tandem mass spectrometry. Spectral library search is far from comprehensive, and numerous compounds remain unannotated. So-called in silico methods allow us to overcome the restrictions of spectral libraries by searching much larger molecular structure databases. Yet, after...
Cyanobacterial regulation of gene expression must contend with a genome organization that lacks apparent functional context, as the majority of cellular processes and metabolic pathways are encoded by genes found at disparate locations across the genome, and relatively few transcription factors exist. In this study, global transcript abundance data from the model cyanobacterium Synechococcus sp. PCC 7002 grown under 42 different conditions were analyzed using Context-Likelihood of Relatedness (CLR). The resulting...
The confident high-throughput identification of small molecules remains one of the most challenging tasks in mass spectrometry-based metabolomics. SIRIUS has become a powerful tool for the interpretation of tandem mass spectra, and shows outstanding performance in identifying the molecular formula of a query compound, which is the first step of structure identification. Nevertheless, identifying both molecular formulas of large compounds above 500 Daltons and novel molecular formulas remains highly challenging. Here, we present ZODIAC, a network-based algorithm for the de novo...
Metabolomics experiments can employ non-targeted tandem mass spectrometry to detect hundreds to thousands of molecules in a biological sample. Structural annotation of molecules is typically carried out by searching their fragmentation spectra in spectral libraries or, recently, in structure databases. Annotations are limited to structures present in the library or database employed, prohibiting a thorough utilization of the experimental data. We present a computational tool for systematic compound class annotation: CANOPUS uses...
Small molecule machine learning tries to predict chemical, biochemical or biological properties from the structure of a molecule. Applications include the prediction of toxicity, ligand binding or retention time. A recent trend is to develop end-to-end models that avoid the explicit integration of domain knowledge via inductive bias. A central assumption in doing so is that there is no coverage bias in the training and evaluation data, meaning that these data are a representative subset of the true distribution we want to learn. Usually,...
Untargeted mass spectrometry is employed to detect small molecules in complex biospecimens, generating data that are difficult to interpret. We developed Qemistree, a data exploration strategy based on the hierarchical organization of molecular fingerprints predicted from fragmentation spectra, represented in the context of sample metadata and chemical ontologies. By expressing molecular relationships as a tree, we can apply ecological tools, designed around the relatedness of DNA sequences, to study chemical composition.
We introduce a method for finding a characteristic substructure for a set of molecular structures. Different from common approaches, such as computing the maximum common subgraph, the resulting substructure does not have to be contained in its exact form in all input molecules. Our approach is part of the identification pipeline for unknown metabolites using fragmentation trees. Searching databases using fragmentation tree alignment results in hit lists containing compounds with large structural similarity to the unknown metabolite. The characteristic substructure of the molecules in the hit list may be a key structural element...
Interpretation of fragmentation mass spectra depends on our knowledge of collision-induced dissociation mechanisms. Computational methods for the annotation of fragmentation mechanisms operate within the boundaries of recognized fragmentation pathways. The prevalence of charge migration fragmentation (CMF) in sodiated ion spectra, which produces nonsodiated fragment ions, is unknown. Here, we investigated the extent of CMF in sodiated precursors by mining the NIST17 spectral library using a diagnostic mass difference. Our results showed that a substantial amount of fragment ions in sodiated precursor...
We present a dynamic programming algorithm for optimally solving the Cograph Editing problem on an $n$-vertex graph that runs in $O(3^n n)$ time and uses $O(2^n)$ space. In this problem, we are given a graph $G = (V, E)$ and the task is to find a smallest possible set $F \subseteq V \times V$ of vertex pairs such that $(V, E \bigtriangleup F)$ is a cograph (or $P_4$-free graph), where $\bigtriangleup$ represents the symmetric difference operator. We also describe a technique for speeding up the performance of the algorithm in practice. Additionally,...
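To make the Cograph Editing problem statement concrete, the following is a minimal Python sketch of its two ingredients: applying an edit set $F$ through the symmetric difference, and testing whether the result is $P_4$-free. It is not the dynamic programming algorithm described in the abstract. The function names (`normalize`, `apply_edits`, `is_cograph`, `smallest_edit_set`) are hypothetical, the cograph test is a brute-force check over all vertex quadruples, and the search simply tries edit sets in order of increasing size, so it is only practical for very small graphs.

```python
from itertools import combinations, permutations


def normalize(pairs):
    """Represent undirected vertex pairs as a set of frozensets."""
    return {frozenset(p) for p in pairs}


def apply_edits(edges, edits):
    """Return E symmetric-difference F: pairs in exactly one of the two sets."""
    return normalize(edges) ^ normalize(edits)


def is_cograph(vertices, edges):
    """Brute-force test that no four vertices induce a path a-b-c-d (a P4)."""
    edge_set = normalize(edges)

    def adjacent(u, v):
        return frozenset((u, v)) in edge_set

    for quad in combinations(vertices, 4):
        for a, b, c, d in permutations(quad):
            path = adjacent(a, b) and adjacent(b, c) and adjacent(c, d)
            chords = adjacent(a, c) or adjacent(a, d) or adjacent(b, d)
            if path and not chords:
                return False
    return True


def smallest_edit_set(vertices, edges):
    """Exhaustive search (assumption: tiny graphs only) for a minimum edit set."""
    pairs = list(combinations(vertices, 2))
    for size in range(len(pairs) + 1):
        for candidate in combinations(pairs, size):
            if is_cograph(vertices, apply_edits(edges, candidate)):
                return set(candidate)


if __name__ == "__main__":
    path_edges = [(1, 2), (2, 3), (3, 4)]  # the path 1-2-3-4 is itself a P4
    print(is_cograph([1, 2, 3, 4], path_edges))
    print(smallest_edit_set([1, 2, 3, 4], path_edges))
```

The exhaustive search enumerates up to $2^{\binom{n}{2}}$ candidate edit sets, which illustrates why an exact algorithm with an $O(3^n n)$ bound is a substantial improvement over the naive approach.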