- Protein Structure and Dynamics
- Stochastic Gradient Optimization Techniques
- Computational Drug Discovery Methods
- Artificial Intelligence in Healthcare and Education
- Tensor Decomposition and Applications
- Vaccines and Immunoinformatics Approaches
- Model Reduction and Neural Networks
- AI in Cancer Detection
- Advanced Neuroimaging Techniques and Applications
- Neural Networks and Applications
- Stochastic Processes and Financial Applications
- Fluid Dynamics Simulations and Interactions
- Advanced Statistical Methods and Models
- Energy Load and Power Forecasting
- Machine Learning in Healthcare
- Topic Modeling
- Bayesian Methods and Mixture Models
- Advanced Neural Network Applications
- Adversarial Robustness in Machine Learning
- Gaussian Processes and Bayesian Inference
- Machine Learning in Materials Science
- Anomaly Detection Techniques and Applications
- Radiomics and Machine Learning in Medical Imaging
- Machine Learning and Data Classification
- COVID-19 Diagnosis Using AI
DeepMind (United Kingdom)
2022-2025
Boğaziçi University
2021-2025
Google (United Kingdom)
2025
Sabancı Üniversitesi
2019
Grokking, the sudden generalization that occurs after prolonged overfitting, is a surprising phenomenon challenging our understanding of deep learning. Although significant progress has been made in understanding grokking, the reasons behind the delayed generalization and its dependence on regularization remain unclear. In this work, we argue that without regularization, grokking tasks push models to the edge of numerical stability, introducing floating point errors in the Softmax function, which we refer to as Softmax Collapse (SC). We demonstrate that SC prevents...
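The kind of floating point error this abstract points to can be illustrated with a small sketch. It is not the authors' code, and the truncated abstract does not give a precise definition of Softmax Collapse; the toy logits, the scale factors, and the helper names `softmax` and `cross_entropy_grad` are assumptions made for illustration. The sketch shows that in float32, once the logit of the correct class dominates, the softmax sum rounds to exactly 1 and the cross-entropy gradient on that class becomes exactly zero, while float64 still carries a small nonzero gradient.

```python
import numpy as np

def softmax(logits):
    """Max-shifted softmax, computed in the dtype of `logits`."""
    shifted = logits - logits.max()
    exps = np.exp(shifted)
    return exps / exps.sum()

def cross_entropy_grad(logits, target):
    """Gradient of cross-entropy w.r.t. the logits: softmax(logits) - one_hot(target)."""
    grad = softmax(logits)
    grad[target] -= 1.0
    return grad

# Hypothetical logits for a 4-class problem; class 0 is the correct one.
base = np.array([2.0, 1.0, 0.0, -1.0])

# Scaling the logits mimics a model that keeps growing its confidence.
grads32 = {}
for scale in (1.0, 10.0, 20.0):
    logits = (scale * base).astype(np.float32)
    grads32[scale] = cross_entropy_grad(logits, target=0)
    print(f"float32 scale={scale:5.1f}  grad on correct class = {grads32[scale][0]!r}")

# Same largest scale in float64: the small terms are no longer absorbed by the 1.
grad64 = cross_entropy_grad((20.0 * base).astype(np.float64), target=0)
print(f"float64 scale= 20.0  grad on correct class = {grad64[0]!r}")
```

At the largest scale the second-largest exponential is roughly 2e-9, which is below half the float32 machine epsilon, so adding it to 1 leaves the sum unchanged. The same value is well above the float64 epsilon, which is why the double-precision gradient survives.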
Understanding the generalization properties of optimization algorithms under heavy-tailed noise has gained growing attention. However, existing theoretical results mainly focus on stochastic gradient descent (SGD), and the analysis of optimizers beyond SGD is still missing. In this work, we establish generalization bounds for SGD with momentum (SGDm) under heavy-tailed noise. We first consider the continuous-time limit of SGDm, i.e., a Lévy-driven stochastic differential equation (SDE), and establish quantitative Wasserstein algorithmic stability bounds for a class of potentially...
Neural network compression techniques have become increasingly popular as they can drastically reduce the storage and computation requirements for very large networks. Recent empirical studies have illustrated that even simple pruning strategies can be surprisingly effective, and several theoretical studies have shown that compressible networks (in specific senses) should achieve a low generalization error. Yet, a theoretical characterization of the underlying cause that makes the networks amenable to such simple compression schemes is still missing. In this study, we address...
In this article, we introduce a dynamic generative model, the Bayesian allocation model (BAM), for modeling count data. BAM covers various probabilistic nonnegative tensor factorization (NTF) and topic models under one general framework. In BAM, allocations are made using a Bayesian network, whose conditional probability tables can be integrated out analytically. We show that, when allocations are viewed as sequential, the resulting marginal process is a special type of Polya urn process, which we name the Polya-Bayes process, an integer...
Statistical models that accurately predict the binding affinity of an input ligand-protein pair can greatly accelerate drug discovery. Such models are trained on available ligand-protein interaction data sets, which may contain biases that lead the predictor models to learn data set-specific, spurious patterns instead of generalizable relationships. This leads the prediction performances of these models to drop dramatically for previously unseen biomolecules. Various approaches that aim to improve model generalizability either have limited applicability or...
Diagnostic AI systems trained using deep learning have been shown to achieve expert-level identification of diseases in multiple medical imaging settings [1,2]. However, such systems are not always reliable and can fail in cases diagnosed accurately by clinicians and vice versa [3]. Mechanisms for leveraging this complementarity to select optimally between discordant decisions of AIs and clinicians have remained largely unexplored in healthcare [4], yet have the potential to achieve levels of performance that exceed those possible from either AI or clinician...
We introduce a dynamic generative model, the Bayesian allocation model (BAM), which establishes explicit connections between nonnegative tensor factorization (NTF), graphical models of discrete probability distributions and their Bayesian extensions, and topic models such as latent Dirichlet allocation. BAM is based on a Poisson process, whose events are marked by using a Bayesian network, where the conditional probability tables of this network are then integrated out analytically. We show that the resulting marginal process turns out to be a Polya urn, an...
Recent studies have shown that heavy tails can emerge in stochastic optimization and that the heaviness of the tails links to the generalization error. While these studies have shed light on interesting aspects of the generalization behavior in modern settings, they relied on strong topological and statistical regularity assumptions, which are hard to verify in practice. Furthermore, it has been empirically illustrated that the relation between heavy tails and generalization might not always be monotonic in practice, contrary to the conclusions of existing theory. In this study, we establish novel links between the tail...
For safety, AI systems in health undergo thorough evaluations before deployment, validating their predictions against a ground truth that is assumed certain. However, this is actually not the case and the ground truth may be uncertain. Unfortunately, this is largely ignored in standard evaluation of AI models but can have severe consequences such as overestimating future performance. To avoid this, we measure the effects of ground truth uncertainty, which we assume decomposes into two main components: annotation uncertainty, which stems from the lack...
Robust generalization of drug-target affinity (DTA) prediction models is a notoriously difficult problem in computational drug discovery. In this article, we present pydebiaseddta: a computational software for improving the generalizability of DTA prediction models to novel ligands and/or proteins. pydebiaseddta serves as the practical implementation of the DebiasedDTA training framework, which advocates modifying the training distribution to mitigate the effect of spurious correlations in the training data set that lead to substantially degraded performance for novel ligands and proteins. Written in Python...
Computational models that accurately predict the binding affinity of an input protein-chemical pair can accelerate drug discovery studies. These models are trained on available protein-chemical interaction datasets, which may contain dataset biases that lead the model to learn dataset-specific patterns, instead of generalizable relationships. As a result, the prediction performance of the models drops for previously unseen biomolecules, i.e., the models cannot generalize to biomolecules outside of the dataset. The latest approaches that aim to improve...