- Stochastic Gradient Optimization Techniques
- Sparse and Compressive Sensing Techniques
- Adversarial Robustness in Machine Learning
- Advanced Bandit Algorithms Research
- Privacy-Preserving Technologies in Data
- Anomaly Detection Techniques and Applications
- Advanced Neural Network Applications
- Machine Learning and Algorithms
- Time Series Analysis and Forecasting
- Markov Chains and Monte Carlo Methods
- Advanced Optimization Algorithms Research
- Machine Learning and ELM
- Neural Networks and Applications
- Risk and Portfolio Optimization
- Advanced Queuing Theory Analysis
- Additive Manufacturing and 3D Printing Technologies
- Fault Detection and Control Systems
- Reinforcement Learning in Robotics
- Complexity and Algorithms in Graphs
- Explainable Artificial Intelligence (XAI)
- Additive Manufacturing Materials and Processes
- Machine Learning and Data Classification
- Multi-Criteria Decision Making
- Advanced Multi-Objective Optimization Algorithms
- Model Reduction and Neural Networks
- IBM Research - Thomas J. Watson Research Center (2018-2025)
- IBM (United States) (2019-2021)
- Brandenburg University of Technology Cottbus-Senftenberg (2018-2021)
- ETH Zurich (2021)
- University of South Carolina (2020)
- Hanoi University of Science and Technology (2019)
- Lehigh University (2016-2018)
- Hanoi Pedagogical University 2 (2018)
- Thang Long University (2018)
In this paper, we propose a StochAstic Recursive grAdient algoritHm (SARAH), as well as its practical variant SARAH+, as a novel approach to finite-sum minimization problems. Different from vanilla SGD and other modern stochastic methods such as SVRG, S2GD, SAG and SAGA, SARAH admits a simple recursive framework for updating stochastic gradient estimates; in comparison with SAG/SAGA, SARAH does not require storage of past gradients. The linear convergence rate is proven under a strong convexity assumption. We also prove (in...
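A minimal sketch of the recursive gradient estimator described above may help; the callback `grad_i(w, i)` (gradient of the i-th component at `w`), the step size, and the loop lengths are illustrative assumptions, not the paper's code.

```python
import numpy as np

def sarah(grad_i, n, w0, eta=0.01, outer_iters=10, inner_iters=100, rng=None):
    """Sketch of SARAH for minimizing (1/n) * sum_i f_i(w)."""
    rng = rng or np.random.default_rng(0)
    w = w0.copy()
    for _ in range(outer_iters):
        # Outer step: one full gradient per outer iteration.
        v = np.mean([grad_i(w, i) for i in range(n)], axis=0)
        w_prev, w = w, w - eta * v
        for _ in range(inner_iters):
            i = rng.integers(n)
            # Recursive estimator: only the previous iterate and estimator are kept,
            # so no table of past per-sample gradients is stored (unlike SAG/SAGA).
            v = grad_i(w, i) - grad_i(w_prev, i) + v
            w_prev, w = w, w - eta * v
    return w
```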
In this paper, we study and analyze the mini-batch version of the StochAstic Recursive grAdient algoritHm (SARAH), a method employing the stochastic recursive gradient, for solving empirical loss minimization in the case of nonconvex losses. We provide a sublinear convergence rate (to stationary points) for general nonconvex functions and a linear convergence rate for gradient dominated functions, both of which have some advantages compared to other modern algorithms.
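The mini-batch variant changes only the inner-loop estimator, averaging gradient differences over a sampled batch instead of a single index; a hedged sketch (the callback `grad_i` and `batch_size` are again illustrative assumptions) is below.

```python
import numpy as np

def minibatch_sarah_inner(grad_i, n, w, w_prev, v, eta, batch_size, rng):
    """One inner step of a mini-batch SARAH sketch."""
    batch = rng.choice(n, size=batch_size, replace=False)
    # Average of per-sample gradient differences over the batch, added to the
    # previous estimator (the same recursion as single-sample SARAH).
    diff = np.mean([grad_i(w, i) - grad_i(w_prev, i) for i in batch], axis=0)
    v = diff + v
    return w - eta * v, w, v  # new iterate, new previous iterate, updated estimator
```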
There is an urgent need to reduce the growing backlog of forensic examinations in Digital Forensics Laboratories (DFLs). Currently, DFLs routinely create forensic duplicates and perform in-depth examinations of all submitted media. This approach is rapidly becoming untenable as more cases involve increasing quantities of digital evidence. A more efficient and effective three-tiered strategy for performing forensic examinations will enable DFLs to produce useful results in a timely manner at different phases of an investigation, while avoiding unnecessary expenditure...
We propose a new stochastic first-order algorithmic framework to solve composite nonconvex optimization problems that covers both finite-sum and expectation settings. Our algorithms rely on the SARAH estimator introduced in (Nguyen et al., 2017) and consist of two steps: a proximal gradient step and an averaging step, making them different from existing proximal-type algorithms. They only require an average smoothness assumption on the objective term and an additional bounded variance assumption if applied to expectation problems. They work with...
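To make the two-step structure concrete, here is a hedged sketch of a single iteration with an L1 regularizer as an example composite term; the step sizes `eta`, `gamma` and the regularization weight `lam` are illustrative choices, and `v` is assumed to be the SARAH estimator from the previous sketch.

```python
import numpy as np

def soft_threshold(x, tau):
    """Proximal operator of tau * ||x||_1 (example composite term)."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def prox_sarah_step(w, v, eta, gamma, lam):
    """One proximal-gradient-plus-averaging iteration sketch."""
    w_hat = soft_threshold(w - eta * v, eta * lam)  # proximal gradient step
    return (1.0 - gamma) * w + gamma * w_hat        # averaging step
```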
Stochastic gradient descent (SGD) is the optimization algorithm of choice in many machine learning applications such as regularized empirical risk minimization and training deep neural networks. The classical convergence analysis of SGD is carried out under the assumption that the norm of the stochastic gradient is uniformly bounded. While this might hold for some loss functions, it is always violated in cases where the objective function is strongly convex. In (Bottou et al., 2016), a new analysis is performed under the assumption that the stochastic gradients are bounded with respect to...
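A one-line calculation, using only the standard strong-convexity inequality (not material from the truncated text), shows why the uniform bound conflicts with strong convexity: if $F$ is $\mu$-strongly convex with minimizer $w^*$, then
$$\|\nabla F(w)\| = \|\nabla F(w) - \nabla F(w^*)\| \ge \mu \|w - w^*\|,$$
which grows without bound as $\|w - w^*\| \to \infty$, so no finite constant can uniformly bound the (expected) stochastic gradient norm over all of $\mathbb{R}^d$.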
In this paper, we propose a unified convergence analysis for a class of generic shuffling-type gradient methods for solving finite-sum optimization problems. Our analysis works with any sampling-without-replacement strategy and covers many known variants such as randomized reshuffling, deterministic or randomized single permutation, and cyclic and incremental schemes. We focus on two different settings: strongly convex and nonconvex problems, but also discuss the non-strongly convex case. Our main contribution consists of new non-asymptotic...
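A short sketch of the method class being analyzed may be useful: each epoch visits every component exactly once in some permutation, and the variants named in the abstract differ only in how that permutation is chosen. The callback `grad_i` and the scheme labels are illustrative assumptions.

```python
import numpy as np

def shuffling_gradient(grad_i, n, w0, eta=0.01, epochs=10, scheme="random", rng=None):
    """Sketch of a shuffling-type gradient method for (1/n) * sum_i f_i(w)."""
    rng = rng or np.random.default_rng(0)
    w = w0.copy()
    fixed_perm = rng.permutation(n)      # reused when scheme == "single"
    for _ in range(epochs):
        if scheme == "random":           # randomized reshuffling: new permutation each epoch
            perm = rng.permutation(n)
        elif scheme == "single":         # one permutation, deterministic or randomized, fixed for all epochs
            perm = fixed_perm
        else:                            # cyclic / incremental order
            perm = np.arange(n)
        for i in perm:
            w = w - eta * grad_i(w, i)   # one component gradient step per visit
    return w
```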
We develop and analyse a variant of the SARAH algorithm which does not require computation of the exact gradient. Thus this new method can be applied to general expectation minimization problems rather than only finite-sum problems. While the original SARAH, as well as its predecessor, SVRG, requires an exact gradient computation on each outer iteration, the inexact variant (iSARAH), which we develop here, requires only a stochastic gradient computed on a mini-batch of sufficient size. The proposed method combines variance reduction via sample size selection and iterative updates. The convergence rate...
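The only structural change from the earlier SARAH sketch is the outer step, so a hedged sketch of one outer iteration is enough; `grad_i`, `batch_size`, and the step size are illustrative assumptions.

```python
import numpy as np

def isarah_outer(grad_i, n, w, eta, batch_size, inner_iters, rng):
    """One outer iteration of an inexact-SARAH (iSARAH) sketch."""
    batch = rng.choice(n, size=batch_size, replace=False)
    # Outer gradient estimated on a sufficiently large mini-batch instead of
    # the full data set, so the method extends to expectation problems.
    v = np.mean([grad_i(w, i) for i in batch], axis=0)
    w_prev, w = w, w - eta * v
    for _ in range(inner_iters):
        i = rng.integers(n)
        v = grad_i(w, i) - grad_i(w_prev, i) + v   # SARAH recursion unchanged
        w_prev, w = w, w - eta * v
    return w
```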
Wire-arc additive manufacturing (WAAM) has received substantial attention in recent years due to its very high build rates. When bulky structures are generated using standard layer-by-layer tool paths, the build rate of the outer contour of the part may lag behind that of the interior. In WAAM, the profile of a single weld bead resembles a parabola. In order to keep the layer height constant at each point of the layer, optimal overlapping distances between adjacent beads can be determined. This paper presents novel multi-bead overlapping models for tool path generation. Mathematical models are established...
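To illustrate the idea of choosing an overlapping distance from a parabolic bead profile, here is a small numerical sketch. The flatness criterion used (overlap area between adjacent beads balances the valley area below the nominal height) and the dimensions `w`, `h` are assumptions for illustration, not necessarily the models developed in the paper.

```python
import numpy as np

def bead(x, w, h):
    """Parabolic single-bead cross-section of width w and height h."""
    return np.maximum(h * (1.0 - (2.0 * x / w) ** 2), 0.0)

def area_balance(d, w, h, n_grid=4000):
    """Overlap area minus valley area for two beads whose centres are d apart.
    A zero crossing marks a spacing where squeezed-out material could fill the
    valley and keep the layer top flat (illustrative criterion)."""
    x = np.linspace(0.0, d, n_grid)
    f1, f2 = bead(x, w, h), bead(x - d, w, h)
    overlap = np.minimum(f1, f2).mean() * d           # area shared by both beads
    valley = (h - np.maximum(f1, f2)).mean() * d      # gap below nominal height h
    return overlap - valley

def optimal_spacing(w=6.0, h=2.0):
    """Grid search for the centre distance with balanced overlap and valley areas."""
    ds = np.linspace(0.5 * w, w, 2000)
    vals = np.array([area_balance(d, w, h) for d in ds])
    return ds[np.argmin(np.abs(vals))]
```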
Clustering is a popular unsupervised learning tool often used to discover groups within a larger population, such as customer segments or patient subtypes. However, despite its use for subgroup discovery and description, few state-of-the-art algorithms provide any rationale behind the clusters found. We propose a novel approach to interpretable clustering that both clusters data points and constructs polytopes around the discovered clusters to explain them. Our framework allows additional constraints on the polytopes, including ensuring...
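A toy sketch of the general idea, not the paper's method: it clusters with k-means and wraps each cluster in an axis-aligned box, a far more restricted shape than the general polytopes described above, with the per-feature bounds serving as a readable cluster description.

```python
import numpy as np
from sklearn.cluster import KMeans

def box_explanations(X, n_clusters=3, seed=0):
    """Cluster X, then describe each cluster by a simple bounding polytope
    (here an axis-aligned box: one [low, high] interval per feature)."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    boxes = {}
    for k in range(n_clusters):
        pts = X[labels == k]
        boxes[k] = list(zip(pts.min(axis=0), pts.max(axis=0)))
    return labels, boxes
```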