Yao Zhang

ORCID: 0000-0003-3780-9711
Research Areas
  • Advanced Causal Inference Techniques
  • Statistical Methods and Inference
  • Advanced Battery Technologies Research
  • Bayesian Modeling and Causal Inference
  • Machine Learning and Data Classification
  • Advancements in Battery Materials
  • Health Systems, Economic Evaluations, Quality of Life
  • Fault Detection and Control Systems
  • Machine Learning and Algorithms
  • Machine Learning in Healthcare
  • Stochastic Gradient Optimization Techniques
  • Domain Adaptation and Few-Shot Learning
  • Adversarial Robustness in Machine Learning
  • Advanced Battery Materials and Technologies
  • Model Reduction and Neural Networks
  • Computational Drug Discovery Methods
  • Machine Learning in Materials Science
  • Gaussian Processes and Bayesian Inference
  • Statistical Methods in Clinical Trials
  • Advanced Data Processing Techniques
  • Text and Document Classification Technologies
  • Speech Recognition and Synthesis
  • Protein Structure and Dynamics
  • Intelligent Tutoring Systems and Adaptive Learning
  • Data Quality and Management

University of Cambridge
2018-2021

Shandong University
2020

Forecasting the state of health and remaining useful life of Li-ion batteries is an unsolved challenge that limits technologies such as consumer electronics and electric vehicles. Here, we build an accurate battery forecasting system by combining electrochemical impedance spectroscopy (EIS), a real-time, non-invasive and information-rich measurement hitherto underused in battery diagnosis, with Gaussian process machine learning. Over 20,000 EIS spectra of commercial Li-ion batteries are collected at different states of health,...

10.1038/s41467-020-15235-7 article EN cc-by Nature Communications 2020-04-06
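
As a rough illustration of the Gaussian process regression named in the abstract above, the sketch below fits a zero-mean GP with a squared-exponential kernel to a handful of made-up points and returns a predictive mean and variance. It is not the paper's model: the scalar input feature, the capacity values, the kernel choice and the noise level are all assumptions made for illustration, and real EIS spectra would be high-dimensional inputs.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return variance * math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def gp_predict(train_x, train_y, test_x, noise=1e-4):
    """Return the GP posterior mean and variance at one test input."""
    n = len(train_x)
    gram = [[rbf_kernel(train_x[i], train_x[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    k_star = [rbf_kernel(x, test_x) for x in train_x]
    alpha = solve(gram, train_y)   # (K + noise * I)^-1 y
    v = solve(gram, k_star)        # (K + noise * I)^-1 k_star
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf_kernel(test_x, test_x) - sum(k * w for k, w in zip(k_star, v))
    return mean, max(var, 0.0)


# Hypothetical data: a scalar feature against normalised battery capacity.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.00, 0.95, 0.88, 0.80]

mean_near, var_near = gp_predict(train_x, train_y, 1.0)   # at a training input
mean_far, var_far = gp_predict(train_x, train_y, 10.0)    # far from the data
```

The predictive variance is small near the training inputs and returns to the prior variance far from them, which is the property that makes a GP useful when a forecast needs an attached uncertainty.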

We report a statistically principled method to quantify the uncertainty of machine learning models for molecular properties prediction. We show that this estimate can be used to judiciously design experiments.

10.1039/c9sc00616h article EN cc-by Chemical Science 2019-01-01

The choice of making an intervention depends on its potential benefit or harm in comparison to alternatives. Estimating the likely outcome of alternatives from observational data is a challenging problem, as all outcomes are never observed and selection bias precludes the direct comparison of differently intervened groups. Despite their empirical success, we show that algorithms that learn domain-invariant representations of inputs (on which to make predictions) are often inappropriate, and develop generalization bounds...

10.48550/arxiv.2001.04754 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Finding parameters that minimise a loss function is at the core of many machine learning methods. The Stochastic Gradient Descent (SGD) algorithm is widely used and delivers state-of-the-art results for many problems. Nonetheless, SGD typically cannot find the global minimum, thus its empirical effectiveness is hitherto mysterious. We derive a correspondence between parameter inference and free energy minimisation in statistical physics. The degree of undersampling plays the role of temperature. Analogous to the energy–entropy...

10.1080/00268976.2018.1483535 article EN Molecular Physics 2018-06-22
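
For readers unfamiliar with the SGD update that the abstract above refers to, here is a minimal sketch on a toy least-squares problem. The data, learning rate and step count are hypothetical. The problem is convex, so SGD does reach the global minimum here; the sketch shows only the update rule, not the nonconvex setting or the free-energy correspondence the paper studies.

```python
import random


def sgd_linear_fit(data, learning_rate=0.1, steps=2000, seed=0):
    """Fit y = w * x + b by SGD, using one randomly drawn sample per step."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(steps):
        x, y = rng.choice(data)          # draw one training example
        error = (w * x + b) - y          # residual of the current prediction
        w -= learning_rate * error * x   # gradient of 0.5 * error**2 w.r.t. w
        b -= learning_rate * error       # gradient of 0.5 * error**2 w.r.t. b
    return w, b


# Hypothetical noise-free data from y = 2x + 1 on a grid in [-1, 1].
data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
w, b = sgd_linear_fit(data)
```

Each step uses the gradient from a single example rather than from the full dataset, which is what makes the method stochastic.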

Deep neural networks are workhorse models in machine learning with multiple layers of nonlinear functions composed in series. Their loss function is highly nonconvex, yet empirically even gradient descent minimization is sufficient to arrive at accurate and predictive models. It is hitherto unknown why deep neural networks are easily optimizable. We analyze the energy landscape of a spin glass model using random matrix theory and algebraic geometry. We analytically show that the multilayered structure holds the key to optimizability: Fixing...

10.1103/physrevlett.124.108301 article EN Physical Review Letters 2020-03-10

Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than prediction in the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclical graph (DAG) as an adjacency matrix...

10.48550/arxiv.2009.13180 preprint EN other-oa arXiv (Cornell University) 2020-01-01
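
The abstract above is truncated before it says how the adjacency matrix is constrained to be a DAG. One common way to score acyclicity of a weighted adjacency matrix is a trace-based penalty in the style of NOTEARS. The sketch below uses a truncated power-series version of that idea. Treat it as an assumption made for illustration, not as the CASTLE objective. The function names and example matrices are hypothetical.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(adjacency):
    """Sum over k = 1..d of trace(S^k) / k!, where S is the elementwise
    square of the adjacency matrix. Every term is non-negative, and the
    sum is zero exactly when the weighted graph has no directed cycle."""
    d = len(adjacency)
    squared = [[w * w for w in row] for row in adjacency]
    power = [row[:] for row in squared]   # holds S^k at iteration k
    penalty, factorial = 0.0, 1.0
    for k in range(1, d + 1):
        factorial *= k
        penalty += sum(power[i][i] for i in range(d)) / factorial
        power = matmul(power, squared)
    return penalty


# Entry [i][j] is the weight of the edge i -> j.
dag = [[0.0, 0.8, 0.3],
       [0.0, 0.0, 0.5],
       [0.0, 0.0, 0.0]]
cyclic = [[0.0, 0.8, 0.0],
          [0.0, 0.0, 0.5],
          [0.4, 0.0, 0.0]]   # 0 -> 1 -> 2 -> 0 closes a cycle

dag_score = acyclicity_penalty(dag)
cyclic_score = acyclicity_penalty(cyclic)
```

Because the penalty is a smooth function of the weights, it can in principle be added to a training loss and driven towards zero by gradient-based optimisation.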

Forecasting the state of health and remaining useful life of Li-ion batteries is an unsolved challenge that limits technologies such as consumer electronics and electric vehicles. Here, we build an accurate battery forecasting system by combining electrochemical impedance spectroscopy (EIS), a real-time, non-invasive and information-rich measurement hitherto underused in battery diagnosis, with Gaussian process machine learning. Over 20,000 EIS spectra of commercial Li-ion batteries are collected at different states of health, charge...

10.17863/cam.52181 article EN 2020-04-06

Machine learning has proved its ability to produce accurate models, but the deployment of these models outside the machine learning community has been hindered by the difficulties of interpreting these models. This paper proposes an algorithm that produces a continuous global interpretation of any given black-box function. Our algorithm employs a variation of projection pursuit in which the ridge functions are chosen to be Meijer G-functions, rather than the usual polynomial splines. Because Meijer G-functions are differentiable in their parameters, we can tune...

10.48550/arxiv.2011.08596 preprint EN cc-by arXiv (Cornell University) 2020-01-01

An essential problem in automated machine learning (AutoML) is that of model selection. A unique challenge in the sequential setting is the fact that the optimal model itself may vary over time, depending on the distribution of features and labels available up to each point in time. In this paper, we propose a novel Bayesian optimization (BO) algorithm to tackle model selection in the sequential setting. This is accomplished by treating the performance at each time step as its own black-box function. In order to solve the resulting multiple black-box function optimization problem jointly and efficiently,...

10.48550/arxiv.2001.03898 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Predicting bioactivity and physical properties of small molecules is a central challenge in drug discovery. Deep learning is becoming the method of choice, but studies to date focus on mean accuracy as the main metric. However, to replace costly and mission-critical experiments by models, a high mean accuracy is not enough: outliers can derail a discovery campaign, thus models need to reliably predict when they will fail, even when the training data is biased; experiments are expensive, thus models need to be data-efficient and suggest informative training sets using active learning. We...

10.17863/cam.48436 article EN 2019-07-10