- Chemical Synthesis and Analysis
- Computational Drug Discovery Methods
- Antimicrobial Peptides and Activities
- Dendrimers and Hyperbranched Polymers
- Machine Learning in Materials Science
- Glycosylation and Glycoproteins Research
- Carbohydrate Chemistry and Synthesis
- Analytical Chemistry and Chromatography
- Monoclonal and Polyclonal Antibodies Research
- RNA Interference and Gene Delivery
- Click Chemistry and Applications
- Enzyme Catalysis and Immobilization
- Microbial Natural Products and Biosynthesis
- Advanced Biosensing and Bioanalysis Techniques
- Machine Learning in Bioinformatics
- Crystallization and Solubility Studies
- X-ray Diffraction in Crystallography
- Biochemical and Structural Characterization
- Bacteriophages and Microbial Interactions
- Various Chemistry Research Topics
- Advanced Proteomics Techniques and Applications
- Vaccines and Immunoinformatics Approaches
- Protein Structure and Dynamics
- Molecular Spectroscopy and Chirality
- Amino Acid Enzymes and Metabolism
University of Bern (2016-2025)
IBM Research - Zurich (2020)
NCCR Chemical Biology - Visualisation and Control of Biological Processes Using Chemistry (2011-2018)
DSM (Netherlands) (2017)
University of Copenhagen (2017)
Université Joseph Fourier (2007)
Centre National de la Recherche Scientifique (2007)
Centre Hospitalier Universitaire de Grenoble (1979-2007)
Hôpital Albert Michallon (2006-2007)
Institute of Catalysis and Petrochemistry (2004)
Drug molecules consist of a few tens of atoms connected by covalent bonds. How many such molecules are possible in total, and what is their structure? This question is of pressing interest in medicinal chemistry to help solve the problems of drug potency, selectivity, and toxicity and to reduce attrition rates by pointing to new molecular series. To better define the unknown chemical space, we have enumerated 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens forming the chemical universe database GDB-17, covering a size range containing many drugs and typical for lead...
GDB-13 enumerates small organic molecules containing up to 13 atoms of C, N, O, S, and Cl following simple chemical stability and synthetic feasibility rules. With 977 468 314 structures, GDB-13 is the largest publicly available small organic molecule database to date.
Conspectus: One of the simplest questions that can be asked about molecular diversity is how many organic molecules are possible in total. To answer this question, my research group has computationally enumerated all possible organic molecules up to a certain size to gain an unbiased insight into the entire chemical space. Our latest database, GDB-17, contains 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens, by far the largest small molecule database reported to date. Molecules allowed by valency rules but unstable or nonsynthesizable...
All molecules of up to 11 atoms of C, N, O, and F possible under consideration of simple valency, chemical stability, and synthetic feasibility rules were generated and collected in a database (GDB). GDB contains 26.4 million molecules (110.9 million stereoisomers), including three- and four-membered rings and triple bonds. By comparison, only 63 857 compounds of up to 11 atoms were found in public databases (a combination of PubChem, ChemACX, ChemSCX, the NCI open database, and the Merck Index). A total of 538 of the 1208 ring systems are currently unknown in the CAS Registry...
The chemical space is the ensemble of all possible molecules, which is believed to contain at least 10^60 organic molecules below 500 Da of interest for drug discovery. This review summarizes the development of the chemical space concept from the enumeration of acyclic hydrocarbons in the 1800s to the recent assembly of the chemical universe database GDB. Chemical space travel algorithms can be used to explore defined regions of chemical space by generating focused virtual libraries. Maps of the chemical space are produced from property spaces visualized by principal component analysis or by self-organizing maps,...
Recurrent Neural Networks (RNNs) trained with a set of molecules represented as unique (canonical) SMILES strings have shown the capacity to create large chemical spaces of valid and meaningful structures. Herein we perform an extensive benchmark on models trained with subsets of GDB-13 of different sizes (1 million, 10,000 and 1000), with different SMILES variants (canonical, randomized and DeepSMILES), with two different recurrent cell types (LSTM and GRU) and with different hyperparameter combinations. To guide the benchmarks, new metrics were developed that define how well a model...
Molecular fingerprints are essential cheminformatics tools for virtual screening and mapping chemical space. Among the different types of fingerprints, substructure fingerprints perform best for small molecules such as drugs, while atom-pair fingerprints are preferable for large molecules such as peptides. However, no available fingerprint achieves good performance on both classes of molecules.
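To make the atom-pair idea concrete, here is a minimal sketch in plain Python. It assumes a molecule is given as a list of element symbols plus a bond list (a hypothetical toy representation, not the format of any cheminformatics library), encodes every atom pair as (element, element, topological distance), and compares two molecules with the Tanimoto coefficient. It is not the fingerprint described in the abstract above.

```python
from collections import deque
from itertools import combinations


def topological_distances(adjacency, start):
    """Breadth-first search: bond-count distance from `start` to every atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def atom_pair_features(elements, bonds):
    """Return the set of (element, element, distance) features.

    Assumes a connected molecule; hydrogens are left implicit.
    """
    adjacency = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        adjacency[i].append(j)
        adjacency[j].append(i)
    features = set()
    for i, j in combinations(range(len(elements)), 2):
        distance = topological_distances(adjacency, i)[j]
        first, second = sorted((elements[i], elements[j]))
        features.add((first, second, distance))
    return features


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity of two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Toy heavy-atom graphs: ethanol (C-C-O) and ethylamine (C-C-N).
ethanol = atom_pair_features(["C", "C", "O"], [(0, 1), (1, 2)])
ethylamine = atom_pair_features(["C", "C", "N"], [(0, 1), (1, 2)])
similarity = tanimoto(ethanol, ethylamine)
print(similarity)  # 1 shared feature out of 5 distinct features
```

Production fingerprints usually hash such features into a fixed-length vector; the set form is kept here so the shared and distinct features stay readable.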
The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (...
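One standard way to reduce a weighted neighbor graph to a tree is a minimum spanning tree. The sketch below uses Kruskal's algorithm with a union-find structure on a tiny hand-made graph. The edge weights and point indices are hypothetical, and the code is an illustration of the general idea rather than the TMAP implementation.

```python
def minimum_spanning_tree(n_points, weighted_edges):
    """Kruskal's algorithm.

    weighted_edges: iterable of (weight, u, v) tuples with 0 <= u, v < n_points.
    Returns the tree as a list of (u, v, weight), in order of increasing weight.
    """
    parent = list(range(n_points))

    def find(x):
        # Follow parent links to the set representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for weight, u, v in sorted(weighted_edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so keep it
            parent[root_u] = root_v
            tree.append((u, v, weight))
    return tree


# Hypothetical distances between four data points (e.g. 1 - similarity).
edges = [(0.1, 0, 1), (0.4, 0, 2), (0.2, 1, 2), (0.7, 1, 3), (0.3, 2, 3)]
tree = minimum_spanning_tree(4, edges)
for u, v, weight in tree:
    print(u, v, weight)
```

A tree over n connected points always has n - 1 edges, which is what keeps the resulting drawing sparse enough to inspect by eye.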
We present the open-source AiZynthFinder software that can be readily used in retrosynthetic planning. The algorithm is based on a Monte Carlo tree search that recursively breaks down a molecule to purchasable precursors. The tree search is guided by an artificial neural network policy that suggests possible precursors by utilizing a library of known reaction templates. The software is fast and can typically find a solution in less than 10 s and perform a complete search in less than 1 min. Moreover, the development of the code was guided by a range of software engineering principles such as automatic testing,...
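The recursive breakdown to purchasable precursors can be illustrated with a toy example. The sketch below deliberately swaps in a plain depth-first search for the Monte Carlo tree search and a fixed lookup table for the neural network policy. `STOCK`, `TEMPLATES`, and `find_route` are hypothetical names and do not correspond to the AiZynthFinder API.

```python
# Hypothetical purchasable building blocks.
STOCK = {"A", "B", "C"}

# Hypothetical "templates": each product maps to alternative precursor sets.
TEMPLATES = {
    "target": [("X", "C"), ("Y",)],
    "X": [("A", "B")],
    "Y": [("Z",)],
}


def find_route(molecule, depth=0, max_depth=5):
    """Return a list of (product, precursors) steps, or None if no route exists.

    A molecule in STOCK needs no further steps, so it yields an empty list.
    """
    if molecule in STOCK:
        return []
    if depth >= max_depth:
        return None
    for precursors in TEMPLATES.get(molecule, []):
        steps = [(molecule, precursors)]
        for precursor in precursors:
            sub_route = find_route(precursor, depth + 1, max_depth)
            if sub_route is None:
                break  # this disconnection fails, try the next template
            steps.extend(sub_route)
        else:
            return steps  # every precursor was resolved down to stock
    return None


route = find_route("target")
for product, precursors in route:
    print(product, "<=", " + ".join(precursors))
```

A depth-first search returns the first route it finds. A Monte Carlo tree search instead balances exploring new disconnections against revisiting promising ones, which matters once each molecule has many candidate templates.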
RXNmapper constructs coherent atom-mapping rules from raw chemical reactions using unsupervised training of neural networks.
Organic synthesis methodology enables the synthesis of complex molecules and materials used in all fields of science and technology and represents a vast body of accumulated knowledge optimally suited for deep learning. While most organic reactions involve distinct functional groups and can readily be learned by deep learning models and chemists alike, regio- and stereoselective transformations are more challenging because their outcome also depends on functional group surroundings. Here, we challenge the Molecular Transformer model to...
Artificial intelligence is driving one of the most important revolutions in organic chemistry. Multiple platforms, including tools for reaction prediction and synthesis planning based on machine learning, have successfully become part of the organic chemists’ daily laboratory, assisting in domain-specific synthetic problems. Unlike reaction prediction and retrosynthetic models, the prediction of reaction yields has received less attention in spite of the enormous potential of accurately predicting reaction conversion rates. Reaction yield models, describing the percentage of reactants...
Recent applications of recurrent neural networks (RNN) enable training models that sample the chemical space. In this study we train an RNN with molecular string representations (SMILES) with a subset of the enumerated database GDB-13 (975 million molecules). We show that a model trained with 1 million structures (0.1% of the database) reproduces 68.9% of the entire database after training, when sampling 2 billion molecules. We also developed a method to assess the quality of the training process using negative log-likelihood plots. Furthermore, we use a mathematical model based on...
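For context on the 68.9% figure, it helps to ask what an idealized sampler would achieve. The sketch below assumes a hypothetical model that draws uniformly, with replacement, from exactly the 975 million database molecules and nothing else. This baseline is an assumption made for illustration and is not presented here as the mathematical model used in the study.

```python
import math


def expected_coverage(database_size, n_samples):
    """Expected fraction of distinct items seen after uniform sampling
    with replacement: 1 - (1 - 1/N)**k, computed in a numerically stable way."""
    return -math.expm1(n_samples * math.log1p(-1.0 / database_size))


database_size = 975_000_000   # GDB-13 size quoted in the abstract
n_samples = 2_000_000_000     # number of sampled molecules quoted in the abstract

coverage = expected_coverage(database_size, n_samples)
print(f"ideal uniform sampler: {coverage:.1%}")
```

Under these assumptions the ideal sampler covers roughly 87% of the database, so even a perfect model would not reach 100% with 2 billion samples; the reported 68.9% is best read against that ceiling.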
Molecular generative models trained with small sets of molecules represented as SMILES strings can generate large regions of the chemical space. Unfortunately, due to the sequential nature of SMILES strings, these models are not able to generate molecules given a scaffold (i.e., partially-built molecules with explicit attachment points). Herein we report a new SMILES-based molecular generative architecture that generates molecules from scaffolds and can be trained from any arbitrary molecular set. This approach is possible thanks to a new molecular set pre-processing algorithm that exhaustively slices all possible combinations...
Predicting the nature and outcome of reactions using computational methods is a crucial tool to accelerate chemical research. The recent application of deep learning-based learned fingerprints to reaction classification and reaction yield prediction has shown an impressive increase in performance compared to previous methods such as DFT- and structure-based fingerprints. However, learned fingerprints require large training data sets, are inherently biased, and are based on complex deep learning architectures. Here we present the differential reaction fingerprint DRFP....
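The idea of a differential fingerprint can be sketched without any learned model: take the features present on only one side of the reaction and hash them into a fixed-length bit vector. The code below assumes the substructure features are already available as strings (the extraction step is omitted and the feature names are hypothetical), and the choice of SHA-256 and 64 bits is arbitrary. It is a sketch of the general approach, not the DRFP reference implementation.

```python
import hashlib


def differential_fingerprint(reactant_features, product_features, n_bits=64):
    """Hash the symmetric difference of two feature sets into a bit vector."""
    # Features that appear on exactly one side describe what the reaction changed.
    changed = set(reactant_features) ^ set(product_features)
    bits = [0] * n_bits
    for feature in changed:
        digest = hashlib.sha256(feature.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % n_bits  # fold into n_bits
        bits[index] = 1
    return bits


# Hypothetical substructure features for a toy substitution reaction.
reactants = {"C-C", "C-Cl", "O-H"}
products = {"C-C", "C-O", "O-H"}

fingerprint = differential_fingerprint(reactants, products)
print(sum(fingerprint), "bits set out of", len(fingerprint))
```

Because unchanged features cancel out, a reaction that alters nothing maps to the all-zero vector, and no training data is needed to compute the fingerprint.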
Machine learning models trained with experimental data for antimicrobial activity and hemolysis are shown to produce new non-hemolytic peptides active against multidrug-resistant bacteria.
The retrosynthetic accessibility score (RAscore) is based on AI-driven retrosynthetic planning and is useful for rapid scoring of synthetic feasibility and for pre-screening large datasets of virtual/generated molecules.
Less is more: The rules of chemical bonding allow simple elements to form a multitude of different molecules, the “chemical universe”. This chemical space was explored by constructing a database of all molecules containing up to 11 atoms, under constraints for chemical stability and synthetic feasibility. The database, which contains 13.9 million compounds (see graph; coverage of property space), can be used to identify possible new drug molecules.
The dendritic architecture applied to peptides provides a practical entry into globular macromolecules resembling proteins. A modular design was chosen using a divergent synthesis on solid support alternating proteinogenic α-amino acids with branching diamino acids, producing peptide dendrimers with molecular weights of 3−5 kDa. Initial studies focused on models for hydrolases and produced esterase peptide dendrimers featuring histidine as the key catalytic residue. Variations of amino acid composition led to enantioselective...