Steven Kearnes

ORCID: 0000-0003-4579-4388
Research Areas
  • Computational Drug Discovery Methods
  • Machine Learning in Materials Science
  • Protein Structure and Dynamics
  • Chemical Synthesis and Analysis
  • Scientific Computing and Data Management
  • Innovative Microfluidic and Catalytic Techniques Innovation
  • Cell Image Analysis Techniques
  • HIV Research and Treatment
  • Chemistry and Chemical Engineering
  • RNA and protein synthesis mechanisms
  • Various Chemistry Research Topics
  • Malaria Research and Control
  • Metabolomics and Mass Spectrometry Studies
  • Mosquito-borne diseases and control
  • Click Chemistry and Applications
  • Crystallography and molecular interactions
  • Image Processing and 3D Reconstruction
  • Signaling Pathways in Disease
  • Scientific Research and Discoveries
  • Advanced Proteomics Techniques and Applications
  • Machine Learning and Data Classification
  • Surface Chemistry and Catalysis
  • AI in cancer detection
  • Environmental Impact and Sustainability
  • Advanced Computational Techniques and Applications

Google (United States)
2017-2024

Relay Therapeutics (United States)
2021-2024

Stanford University
2013-2021

We investigate the impact of choosing regressors and molecular representations for the construction of fast machine learning (ML) models of thirteen electronic ground-state properties of organic molecules. The performance of each regressor/representation/property combination is assessed using learning curves which report out-of-sample errors as a function of training set size with up to $\sim$117k distinct molecules. Molecular structures and properties at the hybrid density functional theory (DFT) level of theory used for training and testing come from the QM9 database...

10.1021/acs.jctc.7b00577 article EN Journal of Chemical Theory and Computation 2017-09-19

We present a framework, which we call Molecule Deep $Q$-Networks (MolDQN), for molecule optimization by combining domain knowledge of chemistry and state-of-the-art reinforcement learning techniques (double $Q$-learning and randomized value functions). We directly define modifications on molecules, thereby ensuring 100\% chemical validity. Further, we operate without pre-training on any dataset to avoid possible bias from the choice of that set. Inspired by problems faced during medicinal chemistry lead optimization,...

10.1038/s41598-019-47148-x article EN cc-by Scientific Reports 2019-07-24

We introduce tensor field neural networks, which are locally equivariant to 3D rotations, translations, and permutations of points at every layer. 3D rotation equivariance removes the need for data augmentation to identify features in arbitrary orientations. Our network uses filters built from spherical harmonics; due to the mathematical consequences of this filter choice, each layer accepts as input (and guarantees as output) scalars, vectors, and higher-order tensors, in the geometric sense of these terms. We demonstrate...

10.48550/arxiv.1802.08219 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Massively multitask neural architectures provide a learning framework for drug discovery that synthesizes information from many distinct biological sources. To train these architectures at scale, we gather large amounts of data from public sources to create a dataset of nearly 40 million measurements across more than 200 biological targets. We investigate several aspects of the massively multitask framework by performing a series of empirical studies and obtain some interesting results: (1) massively multitask networks obtain predictive accuracies significantly better than single-task...

10.48550/arxiv.1502.02072 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Chemical reaction data in journal articles, patents, and even electronic laboratory notebooks are currently stored in various formats, often unstructured, which presents a significant barrier to downstream applications, including the training of machine-learning models. We present the Open Reaction Database (ORD), an open-access schema and infrastructure for structuring and sharing organic reaction data, including a centralized data repository. The ORD schema supports conventional and emerging technologies, from benchtop reactions to automated...

10.1021/jacs.1c09820 article EN cc-by-nc-nd Journal of the American Chemical Society 2021-11-02

DNA-encoded small molecule libraries (DELs) have enabled discovery of novel inhibitors for many distinct protein targets of therapeutic value through screening of libraries with up to billions of unique small molecules. We demonstrate a new approach applying machine learning to DEL selection data by identifying active molecules from a large commercial collection and a virtual library of easily synthesizable compounds. We train models using only DEL selection data and apply automated or automatable filters with chemist review restricted to the removal...

10.1021/acs.jmedchem.0c00452 article EN Journal of Medicinal Chemistry 2020-06-11

Target class-focused drug discovery has a strong track record in pharmaceutical research, yet public domain data indicate that many members of protein families remain unliganded. Here we present a systematic approach to scale up the discovery and characterization of small molecule ligands for the WD40 repeat (WDR) protein family. We developed a comprehensive suite of protocols for protein production, crystallography, and biophysical, biochemical, and cellular assays. A pilot hit-finding campaign using DNA-encoded chemical library selection...

10.1021/acs.jmedchem.4c02010 article EN cc-by Journal of Medicinal Chemistry 2024-11-04

10.1007/s10822-016-9959-3 article EN Journal of Computer-Aided Molecular Design 2016-08-01

Deep learning methods such as multitask neural networks have recently been applied to ligand-based virtual screening and other drug discovery applications. Using a set of industrial ADMET datasets, we compare neural networks to standard baseline models and analyze multitask learning effects with both random cross-validation and a more relevant temporal validation scheme. We confirm that multitask learning can provide modest benefits over single-task models and show that smaller datasets tend to benefit more than larger datasets from multitask learning. Additionally, we find that adding massive amounts of side...

10.48550/arxiv.1606.08793 preprint EN other-oa arXiv (Cornell University) 2016-01-01

10.1016/j.trechm.2020.10.012 article EN Trends in Chemistry 2020-11-19

Massively-Multitask Regression Models (MMRMs) trained on millions of compounds and many thousands of assays can predict bioactivity with accuracy comparable to 4-concentration IC50 experiments. Recent advances in hardware and algorithms have produced a variety of methods for multitask modeling. This report compares the performance of six MMRM algorithms: Profile-QSAR (pQSAR), Alchemite, meta learner (MetaNN), multitask feed-forward neural network (MT-DNN), Bayesian factorization with side information (Macau) and Inductive...

10.26434/chemrxiv-2025-2mrbb preprint EN cc-by-nc-nd 2025-03-12

10.26434/chemrxiv-2025-2mrbb-v2 preprint EN cc-by-nc-nd 2025-03-14

We present RL-VAE, a graph-to-graph variational autoencoder that uses reinforcement learning to decode molecular graphs from latent embeddings. Methods have been described previously for graph-to-graph autoencoding, but these approaches require sophisticated decoders that increase the complexity of training and evaluation (such as requiring parallel encoders and decoders or non-trivial graph matching). Here, we repurpose a simple graph generator to enable efficient decoding and generation of molecular graphs.

10.48550/arxiv.1904.08915 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Retrosynthesis -- the process of identifying a set of reactants to synthesize a target molecule -- is of vital importance to material design and drug discovery. Existing machine learning approaches based on language models and graph neural networks have achieved encouraging results. In this paper, we propose a framework that unifies sequence- and graph-based methods as energy-based models (EBMs) with different energy functions. This unified perspective provides critical insights about EBM variants through a comprehensive...

10.48550/arxiv.2007.13437 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Affordable and effective antiviral therapies are needed worldwide, especially against agents such as dengue virus that are endemic in underserved regions. Many antiviral compounds have been studied in cultured cells but are unsuitable for clinical applications due to pharmacokinetic profiles, side effects, or inconsistent efficacy across dengue serotypes. Such tool compounds can, however, aid in identifying clinically useful treatments. Here, computational screening (Rapid Overlay of Chemical Structures) was used to identify entries...

10.1128/mbio.02839-20 article EN mBio 2020-11-09