Daniel Probst

ORCID: 0000-0003-1737-4407
Research Areas
  • Computational Drug Discovery Methods
  • Machine Learning in Materials Science
  • Advanced Combustion Engine Technologies
  • Combustion and flame dynamics
  • Chemical Synthesis and Analysis
  • Machine Learning in Bioinformatics
  • Vehicle emissions and performance
  • Analytical Chemistry and Chromatography
  • Protein Structure and Dynamics
  • Bioinformatics and Genomic Networks
  • Data Visualization and Analytics
  • Biomedical Text Mining and Ontologies
  • Advanced Text Analysis Techniques
  • Microbial Metabolic Engineering and Bioproduction
  • Metabolomics and Mass Spectrometry Studies
  • Microbial Natural Products and Biosynthesis
  • Various Chemistry Research Topics
  • Turbomachinery Performance and Optimization
  • Biodiesel Production and Applications
  • Refrigeration and Air Conditioning Technologies
  • Scientific Computing and Data Management
  • RNA and protein synthesis mechanisms
  • Genomics and Phylogenetic Studies
  • Topological and Geometric Data Analysis
  • Amino Acid Enzymes and Metabolism

Wageningen University & Research
2025

IBM Research - Zurich
2019-2024

École Polytechnique Fédérale de Lausanne
2022-2024

Convergent Science (United States)
2014-2024

Laboratoire d'Informatique Fondamentale de Lille
2023

Signal Processing (United States)
2023

University of Vienna
2023

University of Bern
1970-2022

Convergence
2018-2019

Argonne National Laboratory
2018

Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules.

10.1186/s13321-020-00445-4 article EN cc-by Journal of Cheminformatics 2020-06-12

The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (...

10.1186/s13321-020-0416-x article EN cc-by Journal of Cheminformatics 2020-02-12

Predicting the nature and outcome of reactions using computational methods is a crucial tool to accelerate chemical research. The recent application of deep learning-based learned fingerprints to reaction classification and reaction yield prediction has shown an impressive increase in performance compared to previous methods such as DFT- and structure-based fingerprints. However, learned fingerprints require large training data sets, are inherently biased, and are based on complex deep learning architectures. Here we present the differential reaction fingerprint DRFP....

10.1039/d1dd00006c article EN cc-by Digital Discovery 2022-01-01

Enzyme catalysts are an integral part of green chemistry strategies towards a more sustainable and resource-efficient chemical synthesis. However, the use of biocatalysed reactions in retrosynthetic planning clashes with the difficulties in predicting the enzymatic activity on unreported substrates and enzyme-specific stereo- and regioselectivity. As of now, only rule-based systems support retrosynthetic planning using biocatalysis, while initial data-driven approaches are limited to forward predictions. Here, we extend reaction as...

10.1038/s41467-022-28536-w article EN cc-by Nature Communications 2022-02-18

Among the various molecular fingerprints available to describe small organic molecules, the extended connectivity fingerprint, up to four bonds (ECFP4) performs best in benchmarking drug analog recovery studies as it encodes substructures with a high level of detail. Unfortunately, ECFP4 requires high dimensional representations (≥ 1024D) to perform well, resulting in nearest neighbor searches in very large databases such as GDB, PubChem or ZINC to perform very slowly due to the curse of dimensionality. Herein we report a new fingerprint called MinHash...

10.1186/s13321-018-0321-8 article EN cc-by Journal of Cheminformatics 2018-12-01

Here we present SmilesDrawer, a dependency-free JavaScript component capable of both parsing and drawing SMILES-encoded molecular structures client-side, developed to be easily integrated into web projects and to display organic molecules in large numbers and fast succession. SmilesDrawer can draw structurally and stereochemically complex structures such as maitotoxin and C60 without using templates, yet has an exceptionally small computational footprint and low memory usage without the requirement for loading images or any other...

10.1021/acs.jcim.7b00425 article EN Journal of Chemical Information and Modeling 2017-12-19

During the past decade, big data have become a major tool in scientific endeavors. Although statistical methods and algorithms are well-suited for analyzing and summarizing enormous amounts of data, the results do not allow for a visual inspection of the entire data. Current scientific software, including R packages and Python libraries such as ggplot2, matplotlib and plot.ly, do not support interactive visualizations of datasets exceeding 100 000 data points on the web. Other solutions enable the web-based visualization of big data only through data reduction or...

10.1093/bioinformatics/btx760 article EN Bioinformatics 2017-11-23

Organic reactions are usually assigned to classes grouping reactions with similar reagents and mechanisms. Reaction classes facilitate the communication of complex concepts and efficient navigation through chemical reaction space. However, the classification process is a tedious task, requiring the identification of the corresponding reaction class template via annotation of the number of molecules in the reactions, the reaction center, and the distinction between reactants and reagents. In this work, we show that transformer-based models can infer reaction classes from non-annotated, simple...

10.26434/chemrxiv.9897365.v3 preprint EN cc-by-nc-nd 2020-08-07

A computational fluid dynamics (CFD) guided combustion system optimization was conducted for a heavy-duty diesel engine running with a gasoline fuel that has a research octane number (RON) of 80. The goal was to optimize the gasoline compression ignition (GCI) combustion recipe (piston bowl geometry, injector spray pattern, in-cylinder swirl motion, and thermal boundary conditions) for improved efficiency while maintaining engine-out...

10.4271/2019-01-0001 article EN SAE International Journal of Advances and Current Practices in Mobility 2019-01-15

The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree...

10.26434/chemrxiv.9698861 preprint EN cc-by-nc-nd 2019-08-21

In recent years, the use of machine learning-based surrogate models for computational fluid dynamics (CFD) simulations has emerged as a promising technique for reducing the computational cost associated with engine design optimization. However, such methods still suffer from drawbacks. One main disadvantage is that the default machine learning (ML) hyperparameters are often severely suboptimal for a given problem. This has often been addressed by manually trying out different hyperparameter settings, but this solution is ineffective in case...

10.1177/14680874211023466 article EN International Journal of Engine Research 2021-07-14

Recent advances in language modeling have had a tremendous impact on how we handle sequential data in science. Language architectures have emerged as a hotbed of innovation and creativity in natural language processing over the last decade, and have since gained prominence in modeling proteins and chemical processes, elucidating structural relationships from textual/sequential data. Surprisingly, some of these relationships refer to three-dimensional structural features, raising important questions on the dimensionality of the information encoded within sequential data. Here, we demonstrate that...

10.1016/j.csbj.2024.04.012 article EN cc-by Computational and Structural Biotechnology Journal 2024-04-30

Members of the classical transient receptor potential protein (TRPC) family are considered as key components of phospholipase C (PLC)-dependent Ca2+ signaling. Previous results obtained in the HEK 293 expression system suggested a physical and functional coupling of TRPC3 to the cardiac-type Na+/Ca2+ exchanger, NCX1 (sodium calcium exchanger 1). This study was designed to test for expression of TRPC3 (transient receptor potential channel 3) and for the existence of a native TRPC3/NCX1 signaling complex in rat cardiac myocytes. Protein expression and cellular distribution were...

10.1016/j.cardiores.2006.10.016 article EN Cardiovascular Research 2006-10-27

Herein we report the discovery of antimicrobial bridged bicyclic peptides (AMBPs) active against Pseudomonas aeruginosa, a highly problematic Gram negative bacterium in the hospital environment. Two of these AMBPs show strong biofilm inhibition and dispersal activity and enhance the activity of polymyxin, currently a last resort antibiotic against which resistance is emerging. To discover our AMBPs we used the concept of chemical space, which is well known in the area of small molecule drug discovery, to define a small number of test compounds for synthesis and experimental...

10.1039/c7sc01314k article EN cc-by-nc Chemical Science 2017-01-01

The recent general availability of low-cost virtual reality headsets and accompanying three-dimensional (3D) engine support presents an opportunity to bring the concept of chemical space into virtual environments. While virtual reality applications represent a category of widespread tools in other fields, their use in the visualization and exploration of abstract data such as chemical spaces has been experimental. In our previous work, we established interactive two-dimensional (2D) maps followed by web-based 3D visualizations, culminating...

10.1021/acs.jcim.8b00402 article EN Journal of Chemical Information and Modeling 2018-08-16

Seven million of the currently 94 million entries in the PubChem database break at least one of the four Lipinski constraints for oral bioavailability, 183,185 of which are also found in the ChEMBL database. These non‐Lipinski PubChem (NLP) and ChEMBL (NLC) subsets are interesting because they contain new modalities that can display biological properties not accessible to small molecule drugs. Unfortunately, the current search tools in PubChem and ChEMBL are designed for small molecules and are not well suited to explore these subsets, which therefore remain poorly appreciated. Herein we...

10.1002/minf.201900016 article EN Molecular Informatics 2019-03-07

Background: Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules. Results: Here we set out to design a new fingerprint suitable for both small and large molecules by combining substructure and atom-pair concepts. Our...

10.26434/chemrxiv.11994630 preprint EN cc-by-nc-nd 2020-03-19

We demonstrate and discuss the feasibility of autonomous first-principles mechanistic explorations for providing quantum chemical data to enhance the confidence of data-driven retrosynthetic and synthesis design based on molecular transformers.

10.1039/d3dd00006k article EN cc-by Digital Discovery 2023-01-01

Enzymatic reactions are an ecofriendly, selective, and versatile addition, sometimes even alternative, to organic reactions for the synthesis of chemical compounds such as pharmaceuticals or fine chemicals. To identify suitable reactions, computational models to predict the activity of enzymes on non-native substrates, to perform retrosynthetic pathway searches, or to predict the outcomes of reactions including regio- and stereoselectivity are becoming increasingly important. However, current approaches are substantially hindered by the limited amount of available...

10.1039/d3sc02048g article EN cc-by Chemical Science 2023-01-01