NFDI4DS | UHH-SEMS - Publication Details

Constrained Bayesian optimization for automatic chemical design using variational autoencoders

OPENALEX - Publications

Ryan‐Rhys Griffiths José Miguel Hernández-Lobato

Automatic Chemical Design is a framework for generating novel molecules with optimized properties.

10.1039/c9sc04026a article EN cc-by Chemical Science 2019-11-18

Mathematical Capabilities of ChatGPT

OPENALEX - Publications

Simon Frieder Luca Pinchetti Ryan‐Rhys Griffiths Tommaso Salvatori Thomas Lukasiewicz and 3 more

We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-language mathematics, used to benchmark language models, either cover only elementary mathematics or are very small. We address this by releasing new datasets: GHOSTS...

10.48550/arxiv.2301.13867 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Mapping Materials and Molecules

OPENALEX - Publications

Bingqing Cheng Ryan‐Rhys Griffiths Simon Wengert Christian Künkel Tamás K. Stenczel and 6 more

Conspectus: The visualization of data is indispensable in scientific research, from the early stages when human insight forms to the final step of communicating results. In computational physics, chemistry and materials science, it can be as simple as making a scatter plot or as straightforward as looking through snapshots of atomic positions manually. However, as a result of the "big data" revolution, these conventional approaches are often inadequate. The widespread adoption of high-throughput computation for discovery...

10.1021/acs.accounts.0c00403 article EN Accounts of Chemical Research 2020-08-14

HEBO: An Empirical Study of Assumptions in Bayesian Optimisation

OPENALEX - Publications

Alexander I. Cowen-Rivers Wenlong Lyu Rasul Tutunov Zhi Wang Antoine Grosnit and 6 more

In this work we rigorously analyse assumptions inherent to black-box optimisation hyper-parameter tuning tasks. Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers. Based on these findings, we propose a Heteroscedastic and Evolutionary Bayesian Optimisation solver (HEBO). HEBO performs non-linear input and output warping, admits exact marginal log-likelihood optimisation and is robust to the values of learned parameters. We demonstrate HEBO's...

10.1613/jair.1.13643 article EN cc-by Journal of Artificial Intelligence Research 2022-07-11

Constrained Bayesian Optimization for Automatic Chemical Design

OPENALEX - Publications

Ryan‐Rhys Griffiths José Miguel Hernández-Lobato

Automatic Chemical Design is a framework for generating novel molecules with optimized properties. The original scheme, featuring Bayesian optimization over the latent space of a variational autoencoder, suffers from the pathology that it tends to produce invalid molecular structures. First, we demonstrate empirically that this pathology arises when the scheme queries points far away from the data on which the autoencoder has been trained. Secondly, by reformulating the search procedure as a constrained problem, we show that the effects can be...

10.48550/arxiv.1709.05501 preprint EN other-oa arXiv (Cornell University) 2017-01-01
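
The abstract above describes recasting the latent-space search as a constrained problem. One common way to do that, shown here only as a minimal sketch and not necessarily the authors' exact formulation, is to weight an expected-improvement acquisition by the probability that a candidate satisfies the constraint (for example, that it decodes to a valid molecule). The function names, and the assumption that a separate model supplies `prob_valid`, are illustrative.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """Closed-form expected improvement (maximisation) for a Gaussian posterior."""
    if std <= 0.0:
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


def constrained_ei(mean, std, best, prob_valid):
    """Expected improvement scaled by the probability the constraint holds.

    prob_valid is assumed to come from a separate (hypothetical) constraint
    model, e.g. a classifier predicting whether a latent point decodes validly.
    """
    return expected_improvement(mean, std, best) * prob_valid
```

Under this weighting, a latent point with a promising predicted objective but a low probability of validity scores poorly, which discourages queries far from the training data.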

Data-driven discovery of molecular photoswitches with multioutput Gaussian processes

OPENALEX - Publications

Ryan‐Rhys Griffiths Jake L. Greenfield Aditya R. Thawani Arian R. Jamasb Henry B. Moss and 6 more

We present a data-driven discovery pipeline for molecular photoswitches through multitask learning with Gaussian processes. Through subsequent screening, we identify several motifs with separated and red-shifted electronic absorption bands.

10.1039/d2sc04306h article EN cc-by Chemical Science 2022-01-01

High-Dimensional Bayesian Optimisation with Variational Autoencoders and Deep Metric Learning

OPENALEX - Publications

Antoine Grosnit Rasul Tutunov Alexandre Max Maraval Ryan‐Rhys Griffiths Alexander I. Cowen-Rivers and 7 more

We introduce a method combining variational autoencoders (VAEs) and deep metric learning to perform Bayesian optimisation (BO) over high-dimensional and structured input spaces. By adapting ideas from deep metric learning, we use label guidance from the blackbox function to structure the VAE latent space, facilitating the Gaussian process fit and yielding improved BO performance. Importantly for BO problem settings, our method operates in semi-supervised regimes where only few labelled data points are available. We run experiments on three...

10.48550/arxiv.2106.03609 preprint EN other-oa arXiv (Cornell University) 2021-01-01

GAUCHE: A Library for Gaussian Processes in Chemistry

OPENALEX - Publications

Ryan‐Rhys Griffiths Leo Klarner Henry B. Moss Aditya Ravuri Sang Truong and 13 more

We introduce GAUCHE, a library for GAUssian processes in CHEmistry. Gaussian processes have long been a cornerstone of probabilistic machine learning, affording particular advantages for uncertainty quantification and Bayesian optimisation. Extending Gaussian processes to chemical representations, however, is nontrivial, necessitating kernels defined over structured inputs such as graphs, strings and bit vectors. By defining such kernels, we seek to open the door to powerful tools for uncertainty quantification and Bayesian optimisation in chemistry. Motivated by scenarios frequently encountered...

10.48550/arxiv.2212.04450 preprint EN cc-by arXiv (Cornell University) 2022-01-01
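
To make "kernels defined over bit vectors" concrete, here is a minimal pure-Python sketch of the Tanimoto (Jaccard) similarity, a standard kernel for binary molecular fingerprints. It is an illustration only and does not reproduce GAUCHE's API; the function name and the convention of returning 1.0 for two all-zero vectors are assumptions.

```python
def tanimoto_kernel(x, y):
    """Tanimoto similarity between two equal-length binary vectors.

    k(x, y) = <x, y> / (|x|^2 + |y|^2 - <x, y>)
    """
    if len(x) != len(y):
        raise ValueError("bit vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = sum(a * a for a in x)
    norm_y = sum(b * b for b in y)
    denom = norm_x + norm_y - dot
    # Assumed convention: two empty fingerprints are treated as identical.
    return 1.0 if denom == 0 else dot / denom


def gram_matrix(fingerprints):
    """Pairwise kernel matrix for a list of bit vectors."""
    return [[tanimoto_kernel(a, b) for b in fingerprints] for a in fingerprints]
```

A Gram matrix built this way can serve as the covariance of a Gaussian process over molecules represented by fingerprints.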

Generative model‐enhanced human motion prediction

OPENALEX - Publications

Anthony Bourached Ryan‐Rhys Griffiths Robert Gray Ashwani Jha Parashkev Nachev

The task of predicting human motion is complicated by the natural heterogeneity and compositionality of actions, necessitating robustness to distributional shifts as far as out-of-distribution (OoD). Here, we formulate a new OoD benchmark based on the Human3.6M and Carnegie Mellon University (CMU) motion capture datasets, and introduce a hybrid framework for hardening discriminative architectures to OoD failure by augmenting them with a generative model. When applied to current state-of-the-art discriminative models, we show that the proposed approach...

10.1002/ail2.63 article EN Applied AI Letters 2022-01-17

Achieving robustness to aleatoric uncertainty with heteroscedastic Bayesian optimisation

OPENALEX - Publications

Ryan‐Rhys Griffiths Alexander A. Aldrick Miguel García-Ortegón Vidhi Lalchand Alpha A. Lee

Bayesian optimisation is a sample-efficient search methodology that holds great promise for accelerating drug and materials discovery programs. A frequently overlooked modelling consideration in Bayesian optimisation strategies, however, is the representation of heteroscedastic aleatoric uncertainty. In many practical applications it is desirable to identify inputs with low aleatoric noise, an example of which might be a material composition that consistently displays robust properties in response to a noisy fabrication process. In this paper, we...

10.1088/2632-2153/ac298c article EN cc-by Machine Learning Science and Technology 2021-09-23

Identification of the variable X-ray sources GX 339-4 and MXB 1659-29 by the scanning modulation collimator on HEAO 1

OPENALEX - Publications

R. Doxsey H. Bradt M. Johnston Ryan‐Rhys Griffiths Robert W. Leach and 3 more

Precise celestial positions have been obtained with the HEAO 1 scanning modulation collimators for the highly variable X-ray source GX 339-4 (4U 1658-48) and the burst source MXB 1659-29. Both sources are identified with faint (17-18 mag) blue objects with He II λ4686 and λλ4640-50 emission.

10.1086/182905 article EN The Astrophysical Journal 1979-03-01

Gaussian Process Molecule Property Prediction with FlowMO

OPENALEX - Publications

Henry B. Moss Ryan‐Rhys Griffiths

We present FlowMO: an open-source Python library for molecular property prediction with Gaussian Processes. Built upon GPflow and RDKit, FlowMO enables the user to make predictions with well-calibrated uncertainty estimates, an output central to active learning and molecular design applications. Gaussian Processes are particularly attractive for modelling small molecular datasets, a characteristic of many real-world virtual screening campaigns where high-quality experimental data is scarce. Computational experiments across three datasets...

10.48550/arxiv.2010.01118 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Modeling the Multiwavelength Variability of Mrk 335 Using Gaussian Processes

OPENALEX - Publications

Ryan‐Rhys Griffiths Jiachen Jiang D. J. K. Buisson D. R. Wilkins Luigi Gallo and 10 more

Abstract: The optical and UV variability of the majority of active galactic nuclei may be related to the reprocessing of rapidly changing X-ray emission from a more compact region near the central black hole. Such a reprocessing model would be characterized by lags between X-ray and optical/UV emission due to differences in light travel time. Observationally, however, such lag features have been difficult to detect due to gaps in the lightcurves introduced through factors such as source visibility or limited telescope time. In this work, Gaussian process regression is...

10.3847/1538-4357/abfa9f article EN The Astrophysical Journal 2021-06-01

Bayesian optimisation for additive screening and yield improvements – beyond one-hot encoding

OPENALEX - Publications

Bojana Ranković Ryan‐Rhys Griffiths Henry B. Moss Philippe Schwaller

Cost-effective Bayesian optimisation screening of 720 additives on four complex reactions, achieving substantial yield improvements over baselines using chemical reaction representations beyond one-hot encoding.

10.1039/d3dd00096f article EN cc-by Digital Discovery 2023-11-02

HEBO: Pushing The Limits of Sample-Efficient Hyperparameter Optimisation

OPENALEX - Publications

Alexander I. Cowen-Rivers Wenlong Lyu Rasul Tutunov Zhi Wang Antoine Grosnit and 6 more

In this work we rigorously analyse assumptions inherent to black-box optimisation hyper-parameter tuning tasks. Our results on the Bayesmark benchmark indicate that heteroscedasticity and non-stationarity pose significant challenges for black-box optimisers. Based on these findings, we propose a Heteroscedastic and Evolutionary Bayesian Optimisation solver (HEBO). HEBO performs non-linear input and output warping, admits exact marginal log-likelihood optimisation and is robust to the values of learned parameters. We demonstrate HEBO's...

10.48550/arxiv.2012.03826 preprint EN cc-by arXiv (Cornell University) 2020-01-01

The Photoswitch Dataset: A Molecular Machine Learning Benchmark for the Advancement of Synthetic Chemistry

OPENALEX - Publications

Aditya R. Thawani Ryan‐Rhys Griffiths Arian R. Jamasb Anthony Bourached Penelope Jones and 3 more

The space of synthesizable molecules is greater than $10^{60}$, meaning only a vanishingly small fraction of these molecules have ever been realized in the lab. In order to prioritize which regions of this space to explore next, synthetic chemists need access to accurate molecular property predictions. While great advances in molecular machine learning have been made, there is a dearth of benchmarks featuring properties that are useful for the synthetic chemist. Focussing directly on the needs of the synthetic chemist, we introduce the Photoswitch Dataset, a new benchmark where...

10.26434/chemrxiv.12609899 preprint EN cc-by-nc-nd 2020-07-06

Analyzing global utilization and missed opportunities in debt-for-nature swaps with generative AI

OPENALEX - Publications

Nataliya Tkachenko Simon Frieder Ryan‐Rhys Griffiths Christoph Nedopil

We deploy a prompt-augmented GPT-4 model to distill comprehensive datasets on the global application of debt-for-nature swaps (DNS), a pivotal financial tool for environmental conservation. Our analysis includes 195 nations and identifies 21 countries that have not yet used DNS before as prime candidates for DNS. A significant proportion demonstrates consistent commitments to conservation finance (0.86 accuracy as compared to historical records). Conversely, 35 countries previously active in DNS before 2010 have since been...

10.3389/frai.2024.1167137 article EN cc-by Frontiers in Artificial Intelligence 2024-02-05

Are we Forgetting about Compositional Optimisers in Bayesian Optimisation?

OPENALEX - Publications

Antoine Grosnit Alexander I. Cowen-Rivers Rasul Tutunov Ryan‐Rhys Griffiths Jun Wang and 1 more

Bayesian optimisation presents a sample-efficient methodology for global optimisation. Within this framework, a crucial performance-determining subroutine is the maximisation of the acquisition function, a task complicated by the fact that acquisition functions tend to be non-convex and thus nontrivial to optimise. In this paper, we undertake a comprehensive empirical study of approaches to maximise the acquisition function. Additionally, by deriving novel, yet mathematically equivalent, compositional forms for popular acquisition functions, we recast the maximisation task as a compositional optimisation problem,...

10.48550/arxiv.2012.08240 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Dataset Bias in the Natural Sciences: A Case Study in Chemical Reaction Prediction and Synthesis Design

OPENALEX - Publications

Ryan‐Rhys Griffiths Philippe Schwaller Alpha A. Lee

Datasets in the Natural Sciences are often curated with the goal of aiding scientific understanding and hence may not always be in a form that facilitates the application of machine learning. In this paper, we identify three trends within the fields of chemical reaction prediction and synthesis design that require a change in direction. First, the manner in which reaction datasets are split into reactants and reagents encourages testing models in an unrealistically generous manner. Second, we highlight the prevalence of mislabelled data, and suggest that the focus should be on...

10.26434/chemrxiv.7366973.v1 preprint EN 2018-11-21

Bayesian optimisation for additive screening and yield improvements in chemical reactions – beyond one-hot encoding

OPENALEX - Publications

Bojana Ranković Ryan‐Rhys Griffiths Henry B. Moss Philippe Schwaller

Reaction additives play a significant role in controlling the reactivity and outcomes of chemical reactions. For example, a recent high-throughput additive screening identified a phthalimide ligand additive for Ni-catalysed photoredox decarboxylative arylations. This discovery enabled a 4-fold yield improvement by stabilising oxidative addition complexes and breaking up deactivated catalyst aggregates. Despite the promise of such large-scale screenings, they remain inaccessible to most research groups due to their cost...

10.26434/chemrxiv-2022-nll2j-v3 preprint EN cc-by-nc 2023-06-15

The Photoswitch Dataset: A Molecular Machine Learning Benchmark for the Advancement of Synthetic Chemistry

OPENALEX - Publications

Aditya R. Thawani Ryan‐Rhys Griffiths Arian R. Jamasb Anthony Bourached Penelope Jones and 3 more

The space of synthesizable molecules is greater than $10^{60}$, meaning only a vanishingly small fraction of these molecules have ever been realized in the lab. In order to prioritize which regions of this space to explore next, synthetic chemists need access to accurate molecular property predictions. While great advances in molecular machine learning have been made, there is a dearth of benchmarks featuring properties that are useful for the synthetic chemist. Focussing directly on the needs of the synthetic chemist, we introduce the Photoswitch Dataset, a new benchmark where...

10.26434/chemrxiv.12609899.v1 preprint EN 2020-07-06

Applications of Gaussian Processes at Extreme Lengthscales: From Molecules to Black Holes

OPENALEX - Publications

Ryan‐Rhys Griffiths

In many areas of the observational and experimental sciences data is scarce. Data observation in high-energy astrophysics is disrupted by celestial occlusions and limited telescope time, while data derived from laboratory experiments in synthetic chemistry and materials science is cost-intensive to collect. On the other hand, knowledge about the data-generation mechanism is often available in the sciences, such as the measurement error of a piece of apparatus. Both characteristics, small data and knowledge of the underlying physics, make Gaussian processes (GPs)...

10.48550/arxiv.2303.14291 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Adaptive Sensor Placement for Continuous Spaces

OPENALEX - Publications

James A. Grant Alexis Boukouvalas Ryan‐Rhys Griffiths David S. Leslie Sattar Vakili and 1 more

We consider the problem of adaptively placing sensors along an interval to detect stochastically-generated events. We present a new formulation of the problem as a continuum-armed bandit problem with feedback in the form of partial observations of realisations of an inhomogeneous Poisson process. We design a solution method by combining Thompson sampling with nonparametric inference via increasingly granular Bayesian histograms and derive an $\tilde{O}(T^{2/3})$ bound on the Bayesian regret in $T$ rounds. This is coupled with an efficient optimisation approach to select...

10.48550/arxiv.1905.06821 preprint EN other-oa arXiv (Cornell University) 2019-01-01
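
The combination of Thompson sampling with a Bayesian histogram described above can be illustrated with a heavily simplified sketch. It assumes a fixed number of bins (not the increasingly granular histograms of the paper), one sensor covering one bin per round, and a conjugate Gamma prior on each bin's Poisson rate. All class and function names are hypothetical.

```python
import math
import random


class GammaPoissonBin:
    """Gamma(shape, rate) posterior over the event rate in one histogram bin."""

    def __init__(self, shape=1.0, rate=1.0):
        self.shape = shape
        self.rate = rate

    def sample_rate(self, rng):
        # random.gammavariate takes (shape, scale), so scale = 1 / rate.
        return rng.gammavariate(self.shape, 1.0 / self.rate)

    def update(self, count, exposure=1.0):
        # Conjugate update after observing `count` events over `exposure` time.
        self.shape += count
        self.rate += exposure


def poisson_sample(lam, rng):
    """Draw from Poisson(lam) with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def choose_bin(bins, rng):
    """Thompson step: sample a rate per bin and pick the largest sample."""
    samples = [b.sample_rate(rng) for b in bins]
    return max(range(len(bins)), key=samples.__getitem__)


def run(true_rates, rounds, seed=0):
    """Simulate sensor placement against known (simulated) per-bin rates."""
    rng = random.Random(seed)
    bins = [GammaPoissonBin() for _ in true_rates]
    pulls = [0] * len(true_rates)
    for _ in range(rounds):
        i = choose_bin(bins, rng)
        bins[i].update(poisson_sample(true_rates[i], rng))
        pulls[i] += 1
    return bins, pulls
```

Calling `run([0.2, 2.0, 0.5], 200)` returns the per-bin posteriors and how often each bin was selected; over many rounds the sampler tends to favour bins with higher event rates.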