- Stochastic Gradient Optimization Techniques
- Sparse and Compressive Sensing Techniques
- Privacy-Preserving Technologies in Data
- Statistical Methods and Inference
- Advanced Optimization Algorithms Research
- Topic Modeling
- Natural Language Processing Techniques
- Multimodal Machine Learning Applications
- Neural Networks and Applications
- Markov Chains and Monte Carlo Methods
- Domain Adaptation and Few-Shot Learning
- Blind Source Separation Techniques
- Machine Learning and Algorithms
- Ferroelectric and Negative Capacitance Devices
- Optimization and Variational Analysis
- Time Series Analysis and Forecasting
- Bayesian Methods and Mixture Models
- Complexity and Algorithms in Graphs
- Advanced Multi-Objective Optimization Algorithms
- Gaussian Processes and Bayesian Inference
- Explainable Artificial Intelligence (XAI)
- Model Reduction and Neural Networks
- Machine Learning and ELM
- Wireless Communication Security Techniques
- Advanced Bandit Algorithms Research
- École Polytechnique (2014-2024)
- Centre de Mathématiques Appliquées (2014-2024)
- Institut Polytechnique de Paris (2023)
- National Research University Higher School of Economics (2021)
- École Normale Supérieure - PSL (2014-2020)
- École Polytechnique Fédérale de Lausanne (2019)
- École Normale Supérieure (2017)
- Institut national de recherche en informatique et en automatique (2014)
Time series constitute a challenging data type for machine learning algorithms, due to their highly variable lengths and sparse labeling in practice. In this paper, we tackle this challenge by proposing an unsupervised method to learn universal embeddings of time series. Unlike previous works, it is scalable with respect to their length, and we demonstrate the quality, transferability, and practicability of the learned representations through thorough experiments and comparisons. To this end, we combine an encoder based on causal dilated convolutions...
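A minimal PyTorch sketch of the kind of causal dilated convolution encoder mentioned here; the channel sizes, depth, and max-pooling readout are illustrative choices, not the paper's exact architecture.

```python
# Sketch of a causal dilated convolution encoder for variable-length series.
# Layer sizes and the pooling readout are illustrative, not the paper's exact design.
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1D convolution that only looks at past time steps (left padding)."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # overhang created by padding
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size,
                              padding=self.pad, dilation=dilation)

    def forward(self, x):                 # x: (batch, channels, time)
        return self.conv(x)[:, :, :-self.pad]  # drop the right-hand overhang

class TSEncoder(nn.Module):
    """Stack of causal dilated convolutions + global max pooling,
    producing a fixed-size embedding whatever the series length."""
    def __init__(self, in_ch=1, hidden=40, emb_dim=64, depth=4):
        super().__init__()
        layers, ch = [], in_ch
        for i in range(depth):            # dilation doubles at each layer
            layers += [CausalConv1d(ch, hidden, kernel_size=3, dilation=2 ** i),
                       nn.LeakyReLU()]
            ch = hidden
        self.net = nn.Sequential(*layers)
        self.proj = nn.Linear(hidden, emb_dim)

    def forward(self, x):                 # x: (batch, in_ch, time), any length
        h = self.net(x).max(dim=2).values # pooling removes the time dimension
        return self.proj(h)               # (batch, emb_dim)

emb = TSEncoder()(torch.randn(8, 1, 500))  # works for any input length
```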
We consider the random-design least-squares regression problem within the reproducing kernel Hilbert space (RKHS) framework. Given a stream of independent and identically distributed input/output data, we aim to learn a function in an RKHS $\mathcal{H}$, even if the optimal predictor (i.e., the conditional expectation) is not in $\mathcal{H}$. In a stochastic approximation framework where the estimator is updated after each observation, we show that the averaged unregularized least-mean-square algorithm (a form of stochastic gradient...
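A toy numpy sketch of the unregularized least-mean-square recursion in an RKHS with Polyak-Ruppert averaging; the Gaussian kernel, step size, and data model are illustrative assumptions, not the paper's setting.

```python
# Toy sketch of averaged, unregularized kernel least-mean-squares:
#   f_t = f_{t-1} - gamma * (f_{t-1}(x_t) - y_t) * K(x_t, .)
# Gaussian kernel and constant step size gamma are illustrative choices.
import numpy as np

def kernel(x, y, bandwidth=0.5):
    return np.exp(-(x - y) ** 2 / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
n = 2000
X = rng.uniform(-1, 1, n)
Y = np.sign(X) + 0.1 * rng.standard_normal(n)  # target need not lie in the RKHS

gamma = 0.1
alpha = np.zeros(n)        # f_t = sum_i alpha[i] * K(X[i], .)
alpha_bar = np.zeros(n)    # running Polyak-Ruppert average of the iterates

for t in range(n):
    pred = alpha[:t] @ kernel(X[:t], X[t])  # current prediction f_{t-1}(x_t)
    alpha[t] = -gamma * (pred - Y[t])       # LMS update adds one kernel atom
    alpha_bar = (t * alpha_bar + alpha) / (t + 1)

f_bar = lambda x: alpha_bar @ kernel(X, x)
print(f_bar(0.5), f_bar(-0.5))              # roughly +1 and -1
```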
We consider the minimization of a strongly convex objective function given access to unbiased estimates of its gradient through stochastic gradient descent (SGD) with a constant step size. While a detailed analysis was only performed for quadratic functions, we provide an explicit asymptotic expansion of the moments of the averaged SGD iterates that outlines the dependence on initial conditions, the effect of noise and step size, as well as the lack of convergence in the general (nonquadratic) case. For this analysis, we bring tools from Markov chain theory into...
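A numerical sketch of the phenomenon described: with a constant step size the iterates keep fluctuating, while the averaged iterate settles near, but not exactly at, the minimizer in a nonquadratic case. The objective and noise model below are illustrative assumptions.

```python
# With constant step size, SGD iterates reach a stationary distribution rather
# than converging; Polyak-Ruppert averaging concentrates around a point at a
# small, gamma-dependent distance from the optimum in the nonquadratic case.
import numpy as np

rng = np.random.default_rng(1)
gamma, n = 0.1, 200_000

def grad_estimate(x):
    # strongly convex, nonquadratic: f(x) = x**2 / 2 + 0.1 * exp(x),
    # gradient observed with additive zero-mean noise
    return x + 0.1 * np.exp(x) + rng.standard_normal()

x, x_bar = 2.0, 0.0
for t in range(n):
    x -= gamma * grad_estimate(x)   # fluctuates at scale sqrt(gamma)
    x_bar += (x - x_bar) / (t + 1)  # running average of the iterates

x_star = -0.0913                    # numerical root of x + 0.1 * exp(x) = 0
print(x - x_star, x_bar - x_star)   # last iterate noisy; average close,
                                    # with a small gamma-dependent bias
```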
Federated Learning (FL) is a novel approach enabling several clients holding sensitive data to collaboratively train machine learning models, without centralizing data. The cross-silo FL setting corresponds to the case of few ($2$--$50$) reliable clients, each holding medium to large datasets, and is typically found in applications such as healthcare, finance, or industry. While previous works have proposed representative datasets for cross-device FL, few realistic healthcare cross-silo FL datasets exist, thereby slowing algorithmic...
Synchronous mini-batch SGD is the state-of-the-art for large-scale distributed machine learning. However, in practice, its convergence is bottlenecked by slow communication rounds between worker nodes. A natural solution to reduce communication is to use the \emph{`local-SGD'} model in which the workers train their models independently and synchronize every once in a while. This algorithm improves the computation-communication trade-off but is not understood very well. We propose a non-asymptotic error analysis, which enables comparison...
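A minimal sketch of the local-SGD scheme just described, assuming a toy shared least-squares objective: each worker runs H local steps and the models are synchronized by averaging. Problem, step size, and constants are illustrative.

```python
# Local-SGD: W workers each take H local SGD steps between synchronization
# rounds; synchronization averages the worker models.
import numpy as np

rng = np.random.default_rng(2)
W, H, rounds, gamma, d = 8, 10, 100, 0.05, 5

A = rng.standard_normal((200, d))
b = A @ rng.standard_normal(d)          # shared least-squares objective

def stochastic_grad(x):
    i = rng.integers(len(A))            # one sampled row per step
    return (A[i] @ x - b[i]) * A[i]

x_global = np.zeros(d)
for _ in range(rounds):
    local_models = []
    for _ in range(W):                  # each worker starts from the
        x = x_global.copy()             # current global model
        for _ in range(H):
            x -= gamma * stochastic_grad(x)
        local_models.append(x)
    x_global = np.mean(local_models, axis=0)  # synchronize by averaging

print(np.linalg.norm(A @ x_global - b))       # residual after training
```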
Uncertainty quantification of predictive models is crucial in decision-making problems. Conformal prediction is a general and theoretically sound answer. However, it requires exchangeable data, excluding time series. While recent works tackled this issue, we argue that Adaptive Conformal Inference (ACI, Gibbs and Cand{\`e}s, 2021), developed for distribution-shifted time series, is a good procedure for time series with general dependency. We analyse the impact of the learning rate on its efficiency in the auto-regressive case. We propose a parameter-free...
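A short sketch of the ACI update of Gibbs and Candès (2021), whose learning rate gamma is the quantity analysed here: the effective level is raised after covered steps and lowered after miscoverage. The score distribution and the value of gamma are illustrative.

```python
# Adaptive Conformal Inference (ACI): alpha_t is adjusted online at rate gamma
# according to whether the previous interval covered the observation.
import numpy as np

def aci(cal_scores, new_scores, alpha=0.1, gamma=0.01):
    """cal_scores: calibration conformity scores; new_scores: online scores
    |y_t - prediction_t|. Returns the interval radii produced over time."""
    alpha_t, history, radii = alpha, list(cal_scores), []
    for s in new_scores:
        # empirical (1 - alpha_t)-quantile of past scores, level clipped to [0, 1]
        q = np.quantile(history, min(max(1 - alpha_t, 0.0), 1.0))
        err = float(s > q)                # 1 if the interval missed y_t
        alpha_t += gamma * (alpha - err)  # the ACI update rule
        history.append(s)
        radii.append(q)
    return np.array(radii)

rng = np.random.default_rng(3)
radii = aci(rng.gamma(2, 1, 500), rng.gamma(2, 1, 200))
```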
Federated Learning (FL) is a paradigm for large-scale distributed learning which faces two key challenges: (i) efficient training from highly heterogeneous user data, and (ii) protecting the privacy of participating users. In this work, we propose a novel FL approach (DP-SCAFFOLD) to tackle these challenges together by incorporating Differential Privacy (DP) constraints into the popular SCAFFOLD algorithm. We focus on the challenging setting where users communicate with an "honest-but-curious" server...
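A schematic sketch of one client round combining SCAFFOLD-style control variates with Gaussian perturbation of the transmitted quantities; the clipping threshold and noise scale below are placeholders, not the paper's calibrated DP mechanism.

```python
# One DP-SCAFFOLD-style client round: control variates correct client drift,
# and only clipped, noised deltas are released to the honest-but-curious server.
import numpy as np

rng = np.random.default_rng(4)
d, K, lr, C, sigma = 10, 5, 0.1, 1.0, 0.5   # placeholder DP parameters

def privatize(v):
    """Clip to norm C, then add Gaussian noise before release."""
    v = v * min(1.0, C / (np.linalg.norm(v) + 1e-12))
    return v + sigma * C * rng.standard_normal(v.shape)

def client_round(x, c, c_i, grad):
    """x: global model; c, c_i: server and client control variates."""
    y = x.copy()
    for _ in range(K):
        y -= lr * (grad(y) - c_i + c)        # drift-corrected local step
    c_i_new = c_i - c + (x - y) / (K * lr)   # SCAFFOLD control-variate update
    return privatize(y - x), privatize(c_i_new - c_i)  # released deltas

grad = lambda y: y                            # toy quadratic f(y) = ||y||^2 / 2
dx, dc = client_round(np.ones(d), np.zeros(d), np.zeros(d), grad)
```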
Compression schemes have been extensively used in Federated Learning (FL) to reduce the communication cost of distributed learning. While most approaches rely on a bounded variance assumption on the noise produced by the compressor, this paper investigates the use of compression and aggregation schemes that produce a specific error distribution, e.g., Gaussian or Laplace, on the aggregated data. We present and analyze different schemes based on layered quantizers achieving the exact error distribution. We provide methods to leverage the proposed schemes to obtain...
PEPit is a Python package aiming at simplifying the access to worst-case analyses of a large family of first-order optimization methods, possibly involving gradient, projection, proximal, or linear optimization oracles, along with their approximate, or Bregman variants. In short, it enables computer-assisted worst-case analyses of first-order methods. The key underlying idea is to cast the problem of performing a worst-case analysis, often referred to as a performance estimation problem (PEP), as a semidefinite program (SDP) which can be solved numerically. For doing that, users are only...
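A short example following the usage pattern of the PEPit documentation: computing the worst-case rate of a few gradient steps on a smooth strongly convex function; the exact class and method names should be checked against the installed version.

```python
# Worst-case analysis of gradient descent via PEPit (pip install pepit):
# the PEP is stated through points/oracles and solved as an SDP.
from PEPit import PEP
from PEPit.functions import SmoothStronglyConvexFunction

L, mu, gamma, n = 1.0, 0.1, 1.0, 3

problem = PEP()
f = problem.declare_function(SmoothStronglyConvexFunction, mu=mu, L=L)
xs = f.stationary_point()                 # the (unknown) minimizer x*

x0 = problem.set_initial_point()
problem.set_initial_condition((x0 - xs) ** 2 <= 1)

x = x0
for _ in range(n):                        # the method under analysis
    x = x - gamma * f.gradient(x)
problem.set_performance_metric((x - xs) ** 2)

tau = problem.solve()                     # numerical worst-case value
# compare with the known tight rate max(|1-gamma*mu|, |1-gamma*L|)**(2n)
print(tau, max(abs(1 - gamma * mu), abs(1 - gamma * L)) ** (2 * n))
```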
Stochastic Approximation (SA) is a classical algorithm that has had, since the early days, a huge impact on signal processing, and nowadays on machine learning, due to the necessity of dealing with large amounts of data observed with uncertainties. An exemplar special case pertains to the popular stochastic (sub)gradient algorithm, which is the workhorse behind many important applications. A lesser-known fact is that the SA scheme also extends...
First-order optimization methods have attracted a lot of attention due to their practical success in many applications, including machine learning. Obtaining convergence guarantees and worst-case performance certificates for first-order methods has become crucial for understanding the ingredients underlying efficient methods and for developing new ones. However, obtaining, verifying, and proving such guarantees is often a tedious task. Therefore, a few approaches were proposed to render this task more systematic, and even partially automated. In...
We introduce Artemis, a framework to tackle the problem of learning in a distributed or federated setting with communication constraints and partial device participation. Several workers (randomly sampled) perform the optimization process, using a central server to aggregate their computations. To alleviate the communication cost, Artemis allows compressing the information sent in both directions (from the workers to the server and conversely), combined with a memory mechanism. It improves on existing algorithms that only consider unidirectional compression (to the server), or use...
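A sketch of the uplink side of bidirectional compression with a memory mechanism in the spirit of Artemis, assuming an unbiased rand-k compressor; the compressor choice and all constants are illustrative.

```python
# Compressed communication with memory: each worker sends a compressed
# difference to its memory; the server (which mirrors the memories) rebuilds
# an estimate of the gradient, then sends back a compressed aggregate.
import numpy as np

rng = np.random.default_rng(5)
d, k, W, alpha = 20, 4, 4, 0.5

def rand_k(v):
    """Unbiased rand-k sparsification: keep k random coords, rescale by d/k."""
    idx = rng.choice(d, k, replace=False)
    out = np.zeros_like(v)
    out[idx] = v[idx] * d / k
    return out

g = [rng.standard_normal(d) for _ in range(W)]  # workers' stochastic gradients
h = [np.zeros(d) for _ in range(W)]             # memories, mirrored on server

estimates = []
for i in range(W):
    delta = rand_k(g[i] - h[i])     # uplink: compress difference to memory
    estimates.append(h[i] + delta)  # server-side estimate of g_i
    h[i] = h[i] + alpha * delta     # memory update, identical on both sides

update = rand_k(np.mean(estimates, axis=0))  # downlink: compressed aggregate
```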