- Statistical Methods and Inference
- Sparse and Compressive Sensing Techniques
- Stochastic Gradient Optimization Techniques
- Random Matrices and Applications
- Control Systems and Identification
- Game Theory and Voting Systems
- Differential Equations and Boundary Problems
- Advanced X-ray Imaging Techniques
- Theoretical and Computational Physics
- Advanced Statistical Methods and Models
- Distributed Sensor Networks and Detection Algorithms
- Domain Adaptation and Few-Shot Learning
- Soil Geostatistics and Mapping
- Blind Source Separation Techniques
- Stability and Controllability of Differential Equations
- Fault Detection and Control Systems
- Numerical methods in inverse problems
- Statistical Methods and Bayesian Inference
- Aquatic and Environmental Studies
- Machine Learning and Data Classification
- 3D Modeling in Geospatial Applications
- Mental Health Research Topics
- Stochastic processes and statistical mechanics
- Markov Chains and Monte Carlo Methods
- Advanced Mathematical Modeling in Engineering
Princeton University
2016-2021
Columbia University
2020-2021
University of California, Berkeley
2021
University of Michigan
2021
Fudan University
2019
Peking University
2014
Recent years have seen a flurry of activities in designing provably efficient nonconvex procedures for solving statistical estimation problems. Due to the highly nonconvex nature of the empirical loss, state-of-the-art procedures often require proper regularization (e.g., trimming, regularized cost, projection) in order to guarantee fast convergence. For vanilla procedures such as gradient descent, however, prior theory either recommends conservative learning rates to avoid overshooting, or completely lacks performance guarantees. This...
Recovering low-rank structures via eigenvector perturbation analysis is a common problem in statistical machine learning, such as in factor analysis, community detection, ranking, and matrix completion, among others. While a large variety of bounds are available for average errors between empirical and population statistics of eigenvectors, few results are tight for entrywise analyses, which are critical for a number of problems such as community detection. This paper investigates entrywise behaviors of eigenvectors for a large class of random matrices whose...
This paper is concerned with the problem of top-K ranking from pairwise comparisons. Given a collection of n items and a few pairwise comparisons across them, one wishes to identify the set of K items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model, the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress towards characterizing the performance (e.g. mean...
When the data are stored in a distributed manner, direct applications of traditional statistical inference procedures are often prohibitive due to communication costs and privacy concerns. This paper develops and investigates two Communication-Efficient Accurate Statistical Estimators (CEASE), implemented through iterative algorithms for distributed optimization. In each iteration, node machines carry out computation in parallel and communicate with the central processor, which then broadcasts aggregated information to node machines for new...
Factor models are a class of powerful statistical models that have been widely used to deal with dependent measurements that arise frequently from various applications, from genomics and neuroscience to economics and finance. As data are collected at an ever-growing scale, statistical machine learning faces some new challenges: high dimensionality, strong dependence among observed variables, heavy-tailed variables, and heterogeneity. High-dimensional robust factor analysis serves as a powerful toolkit to conquer these challenges. This paper gives...
Consider the linear model (1) $y = X\beta_0 + e$, where $y = (y_1, \ldots, y_n)^\top \in \mathbb{R}^n$ is a response vector, $X \in \mathbb{R}^{n \times p}$ is a design matrix, $\beta_0$ is an unknown coefficient vec...
We study the multi-task learning problem that aims to simultaneously analyze multiple datasets collected from different sources and learn one model for each of them. We propose a family of adaptive methods that automatically utilize possible similarities among those tasks while carefully handling their differences. We derive sharp statistical guarantees for the methods and prove their robustness against outlier tasks. Numerical experiments on synthetic and real datasets demonstrate the efficacy of our new methods.
The past two decades have witnessed deep cross-fertilization between the two cultures, statistics (data/generative modeling) and machine learning (algorithmic modeling), which is in stark contrast to the scene pictured in Breiman's inspiring work. In light of this major confluence, we find it helpful to single out a few salient examples showcasing the impacts of one on the other, and the research progress in them. We point out in the end that the current big data era especially requires joint efforts from both cultures in order to address some common...