Jason D. Lee

ORCID: 0000-0003-0064-7800
Research Areas
  • Stochastic Gradient Optimization Techniques
  • Sparse and Compressive Sensing Techniques
  • Reinforcement Learning in Robotics
  • Neural Networks and Applications
  • Machine Learning and Algorithms
  • Advanced Bandit Algorithms Research
  • Model Reduction and Neural Networks
  • Statistical Methods and Inference
  • Machine Learning and ELM
  • Domain Adaptation and Few-Shot Learning
  • Advanced Neural Network Applications
  • Adversarial Robustness in Machine Learning
  • Machine Learning and Data Classification
  • Topic Modeling
  • Markov Chains and Monte Carlo Methods
  • Advanced Optimization Algorithms Research
  • Gaussian Processes and Bayesian Inference
  • Natural Language Processing Techniques
  • Matrix Theory and Algorithms
  • Generative Adversarial Networks and Image Synthesis
  • Adaptive Dynamic Programming Control
  • Bayesian Modeling and Causal Inference
  • Systemic Lupus Erythematosus Research
  • Ferroelectric and Negative Capacitance Devices
  • Interconnection Networks and Systems

Princeton University
2017-2025

GS Caltex (South Korea)
2023

Harvard University
2023

Thomas Jefferson University
2023

Princeton Public Schools
2019-2021

University of Southern California
2016-2020

Kaiser Permanente
2020

Georgia Institute of Technology
2020

Southern California University for Professional Studies
2016-2019

LAC+USC Medical Center
2018-2019

We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the approach to model selection by the lasso to form confidence intervals for the selected coefficients and to test whether all relevant variables have been included in the model.

10.1214/15-aos1371 article EN other-oa The Annals of Statistics 2016-04-11
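
The conditioning result at the core of this framework is often summarized, for Gaussian data and an affine selection event, roughly as follows (generic notation, a sketch rather than the paper's exact statement):

```latex
% y ~ N(mu, Sigma); the selection event (e.g., the lasso's active set and signs)
% is written as an affine set {Ay <= b}. Conditional on selection, a linear
% contrast of y is a truncated Gaussian on a computable interval:
\[
  \eta^\top y \,\bigm|\, \{Ay \le b\}
  \;\sim\;
  \mathrm{TN}\!\bigl(\eta^\top\mu,\; \eta^\top\Sigma\eta,\; [\mathcal{V}^-(y),\, \mathcal{V}^+(y)]\bigr).
\]
% Inverting the truncated-Gaussian CDF in eta^T mu then yields confidence intervals
% for selected coefficients that remain valid after model selection.
```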

Most existing machine translation systems operate at the level of words, relying on explicit segmentation to extract tokens. We introduce a neural machine translation (NMT) model that maps a source character sequence to a target character sequence without any segmentation. We employ a character-level convolutional network with max-pooling at the encoder to reduce the length of the source representation, allowing the model to be trained at a speed comparable to subword-level models while capturing local regularities. Our character-to-character model outperforms a recently proposed baseline on WMT’15...

10.1162/tacl_a_00067 article EN cc-by Transactions of the Association for Computational Linguistics 2017-12-01
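
As a rough illustration of the encoder idea (character embeddings, a convolution over characters, and max-pooling that shortens the sequence before a recurrent layer), here is a minimal PyTorch sketch; the layer sizes, kernel width, and pooling stride are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CharConvEncoder(nn.Module):
    def __init__(self, vocab_size=128, emb=64, channels=256, pool_stride=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.conv = nn.Conv1d(emb, channels, kernel_size=5, padding=2)
        self.pool = nn.MaxPool1d(kernel_size=pool_stride, stride=pool_stride)
        self.rnn = nn.GRU(channels, channels, batch_first=True, bidirectional=True)

    def forward(self, char_ids):                  # (batch, seq_len) character indices
        x = self.embed(char_ids).transpose(1, 2)  # (batch, emb, seq_len)
        x = torch.relu(self.conv(x))              # local character n-gram features
        x = self.pool(x)                          # sequence shortened by pool_stride
        out, _ = self.rnn(x.transpose(1, 2))      # contextualize the shorter sequence
        return out

enc = CharConvEncoder()
print(enc(torch.randint(0, 128, (2, 100))).shape)  # torch.Size([2, 20, 512])
```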

We present a communication-efficient surrogate likelihood (CSL) framework for solving distributed statistical inference problems. CSL provides a communication-efficient surrogate to the global likelihood that can be used for low-dimensional estimation, high-dimensional regularized estimation, and Bayesian inference. For low-dimensional estimation, CSL provably improves upon naive averaging schemes and facilitates the construction of confidence intervals. For high-dimensional regularized estimation, CSL leads to a minimax-optimal estimator with controlled communication cost. For Bayesian inference, CSL can be used to form a quasi-posterior distribution that converges to the true posterior....

10.1080/01621459.2018.1429274 article EN Journal of the American Statistical Association 2018-02-05
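
A minimal sketch of the surrogate-likelihood idea for distributed least squares, assuming the usual one-round construction: a pilot estimate, a single communication of local gradients, and a local loss shifted so that its gradient at the pilot matches the global gradient. The toy data and all variable names are mine.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
K, n, d = 5, 200, 10                       # machines, samples per machine, dimension
theta_star = rng.normal(size=d)
Xs = [rng.normal(size=(n, d)) for _ in range(K)]
ys = [X @ theta_star + 0.1 * rng.normal(size=n) for X in Xs]

loss = lambda t, X, y: 0.5 * np.mean((X @ t - y) ** 2)
grad = lambda t, X, y: X.T @ (X @ t - y) / len(y)

theta_bar = np.linalg.lstsq(Xs[0], ys[0], rcond=None)[0]   # pilot estimate on machine 1
global_grad = np.mean([grad(theta_bar, X, y) for X, y in zip(Xs, ys)], axis=0)  # one communication round
local_grad = grad(theta_bar, Xs[0], ys[0])

# Surrogate: local loss on machine 1, corrected so its gradient at theta_bar
# equals the global gradient; it is then minimized locally without further communication.
surrogate = lambda t: loss(t, Xs[0], ys[0]) - (local_grad - global_grad) @ t
theta_csl = minimize(surrogate, theta_bar).x
print(np.linalg.norm(theta_csl - theta_star))              # typically closer to the truth than the pilot
```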

In this paper, we study the problem of learning a shallow artificial neural network that best fits a training data set. We study this problem in the over-parameterized regime where the number of observations is fewer than the number of parameters in the model. We show that with quadratic activations, the optimization landscape of training such shallow networks has certain favorable characteristics that allow globally optimal models to be found efficiently using a variety of local search heuristics. This result holds for an arbitrary set of input/output pairs. For...

10.1109/tit.2018.2854560 article EN publisher-specific-oa IEEE Transactions on Information Theory 2018-07-10
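
A small numerical illustration under assumptions of my own choosing: a one-hidden-layer network with quadratic activation, f(x) = sum_j v_j (w_j^T x)^2, in an over-parameterized regime (k*d > n), fit by a generic local search method (L-BFGS) from a random start.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d, k = 30, 10, 20                         # samples, input dim, hidden width (k*d = 200 > n)
X = rng.normal(size=(n, d))
y = rng.normal(size=n)                       # arbitrary real-valued labels
v = np.tile([1.0, -1.0], k // 2)             # fixed output weights of both signs

def loss_and_grad(w_flat):
    W = w_flat.reshape(k, d)
    Z = X @ W.T                              # (n, k) pre-activations w_j^T x_i
    resid = (Z ** 2) @ v - y                 # quadratic activation, then linear output
    g = (2.0 / n) * v[:, None] * ((Z * resid[:, None]).T @ X)
    return 0.5 * np.mean(resid ** 2), g.ravel()

w0 = rng.normal(size=k * d) / np.sqrt(d)     # random initialization
res = minimize(loss_and_grad, w0, jac=True, method="L-BFGS-B")
print(res.fun)                               # typically near zero: local search reaches a global fit
```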

Matrix completion is a basic machine learning problem that has wide applications, especially in collaborative filtering and recommender systems. Simple non-convex optimization algorithms are popular and effective in practice. Despite recent progress in proving that various algorithms converge from a good initial point, it remains unclear why random or arbitrary initialization suffices. We prove that the commonly used non-convex objective function for \textit{positive semidefinite} matrix completion has no spurious local minima --- all local minima must also be...

10.48550/arxiv.1605.07272 preprint EN other-oa arXiv (Cornell University) 2016-01-01
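
A toy sketch of the setting, with illustrative sizes and step size: gradient descent on the factorized objective for positive semidefinite matrix completion, started from a random point rather than a careful initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, p = 40, 3, 0.5                          # matrix size, rank, observation probability
U_star = rng.normal(size=(d, r)) / np.sqrt(d)
M = U_star @ U_star.T                         # ground-truth low-rank PSD matrix
mask = rng.random((d, d)) < p
mask = np.triu(mask) | np.triu(mask).T        # symmetric set of observed entries

U = rng.normal(size=(d, r)) / np.sqrt(d)      # random initialization, no spectral warm start
lr = 0.05
for _ in range(3000):
    R = mask * (U @ U.T - M)                  # residuals on observed entries only
    U -= lr * (4 * (R @ U))                   # gradient of sum over observed (i,j) of ((UU^T - M)_ij)^2
rel_err = np.linalg.norm(mask * (U @ U.T - M)) / np.linalg.norm(mask * M)
print(rel_err)                                # small in typical runs: random initialization suffices here
```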

We generalize Newton-type methods for minimizing smooth functions to handle a sum of two convex functions: a smooth function and a nonsmooth function with a simple proximal mapping. We show that the resulting proximal Newton-type methods inherit the desirable convergence behavior of Newton-type methods for minimizing smooth functions, even when the search directions are computed inexactly. Many popular methods tailored to problems arising in bioinformatics, signal processing, and statistical learning are special cases of these methods, and our analysis yields new results for some of these methods.

10.1137/130921428 article EN SIAM Journal on Optimization 2014-01-01
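
A hedged sketch of one such method: a proximal Newton iteration for l1-regularized logistic regression, where the smooth term is modeled by a local quadratic and the subproblem (quadratic plus l1 penalty) is solved approximately by an inner proximal-gradient loop. The step sizes, inner solver, and iteration counts are simple illustrative choices, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, lam = 200, 20, 0.05
X = rng.normal(size=(n, d))
w_true = np.zeros(d); w_true[:3] = [2.0, -2.0, 1.5]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-X @ w_true))).astype(float)

def smooth_grad_hess(w):                              # gradient and Hessian of the logistic loss
    prob = 1.0 / (1.0 + np.exp(-X @ w))
    return X.T @ (prob - y) / n, (X.T * (prob * (1 - prob))) @ X / n + 1e-8 * np.eye(d)

soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)   # proximal map of t*||.||_1

w = np.zeros(d)
for _ in range(20):                                   # outer proximal Newton iterations
    g, H = smooth_grad_hess(w)
    L = np.linalg.eigvalsh(H)[-1]                     # step-size constant for the quadratic model
    z = w.copy()
    for _ in range(100):                              # inner proximal-gradient solve of the subproblem
        z = soft(z - (g + H @ (z - w)) / L, lam / L)
    w = z                                             # unit step (in general a line search is used)

obj = np.mean(np.log1p(np.exp(X @ w)) - y * (X @ w)) + lam * np.abs(w).sum()
print(obj, np.count_nonzero(np.round(w, 6)))          # composite objective value and sparsity of the solution
```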

Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current paper proves that gradient descent achieves zero training loss in polynomial time for a deep over-parameterized network with residual connections (ResNet). Our analysis relies on the particular structure of the Gram matrix induced by the architecture. This structure allows us to show that the Gram matrix is stable throughout the training process, and this stability implies the global optimality of the gradient descent algorithm. We further extend our analysis to convolutional architectures and obtain similar...

10.48550/arxiv.1811.03804 preprint EN other-oa arXiv (Cornell University) 2018-01-01
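
A toy numerical illustration of the phenomenon (my own construction, using a two-layer ReLU network rather than the residual and convolutional architectures the paper analyzes): full-batch gradient descent on a heavily over-parameterized network drives the training loss toward zero from a random Gaussian initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 20, 5, 2000                          # samples, input dim, hidden width (m >> n)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # unit-norm inputs
y = rng.normal(size=n)

W = rng.normal(size=(m, d))                    # trained hidden layer
a = rng.choice([-1.0, 1.0], size=m)            # fixed output layer
lr = 0.1
for _ in range(3000):
    Z = X @ W.T                                # (n, m) pre-activations
    resid = np.maximum(Z, 0.0) @ a / np.sqrt(m) - y        # network outputs minus labels
    # gradient of 0.5 * mean(resid^2) with respect to W
    grad_W = ((resid[:, None] * (Z > 0) * a[None, :]).T @ X) / (np.sqrt(m) * n)
    W -= lr * grad_W
print(0.5 * np.mean(resid ** 2))               # close to zero in typical runs
```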

We consider the problem of learning the structure of a pairwise graphical model over continuous and discrete variables. We present a new pairwise model with both continuous and discrete variables that is amenable to structure learning. In previous work, authors have considered Gaussian graphical models and discrete graphical models separately. Our approach is a natural generalization of these two lines of work to the mixed case. The penalization scheme involves a novel symmetric use of the group-lasso norm and follows naturally from a particular parametrization of the model. Supplementary materials for this paper are available online.

10.1080/10618600.2014.900500 article EN Journal of Computational and Graphical Statistics 2014-04-04
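
For context, pairwise models over mixed continuous and discrete variables are commonly written in a conditional-Gaussian form along the following lines (an illustrative shape, not verbatim the paper's parametrization); each edge then corresponds to a group of parameters, which is what makes an edge-wise group-lasso penalty natural:

```latex
\[
  p(x, y) \;\propto\; \exp\Bigl(
    -\tfrac{1}{2}\sum_{s,t} \beta_{st}\, x_s x_t
    \;+\; \sum_{s} \alpha_s\, x_s
    \;+\; \sum_{s,j} \rho_{sj}(y_j)\, x_s
    \;+\; \sum_{r,j} \phi_{rj}(y_r, y_j)
  \Bigr)
\]
% x: continuous variables, y: discrete variables. A continuous-continuous edge carries a
% scalar beta_{st}, a continuous-discrete edge a vector rho_{sj}(.), and a discrete-discrete
% edge a table phi_{rj}(.,.), so penalizing edges means penalizing parameter groups.
```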

We devise a communication-efficient approach to distributed sparse regression in the high-dimensional setting. The key idea is to average debiased or desparsified lasso estimators. We show the approach converges at the same rate as the lasso as long as the dataset is not split across too many machines, and it consistently estimates the support under weaker conditions than the lasso. On the computational side, we propose a new parallel and computationally efficient algorithm to compute the approximate inverse covariance required in the debiasing approach when the data are split across samples....

10.5555/3122009.3122014 article EN Journal of Machine Learning Research 2017-01-01
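
A hedged sketch of the averaging idea: each machine fits a lasso on its local data, forms a debiased (desparsified) estimate, and the debiased estimates are averaged. The ridge-regularized inverse covariance below is a crude stand-in for the parallel approximate-inverse algorithm the abstract refers to; the data, tuning parameters, and names are mine.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
K, n, d, s = 10, 200, 500, 5                 # machines, local samples, dimension, sparsity
beta = np.zeros(d); beta[:s] = 1.0

debiased = []
for _ in range(K):
    X = rng.normal(size=(n, d))
    y = X @ beta + 0.5 * rng.normal(size=n)
    b = Lasso(alpha=0.1).fit(X, y).coef_                   # local lasso fit
    Theta = np.linalg.inv(X.T @ X / n + 0.5 * np.eye(d))   # crude approximate inverse covariance
    debiased.append(b + Theta @ X.T @ (y - X @ b) / n)     # debiased / desparsified lasso estimate
avg = np.mean(debiased, axis=0)                            # one round of communication: average

support_est = np.sort(np.argsort(np.abs(avg))[-s:])
print(np.linalg.norm(avg - beta), support_est)             # error and recovered support (truth: 0..4)
```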

We consider the problem of learning a one-hidden-layer neural network with a non-overlapping convolutional layer and ReLU activation, i.e., $f(\mathbf{Z}, \mathbf{w}, \mathbf{a}) = \sum_j a_j \sigma(\mathbf{w}^\top \mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned. When the labels are the outputs from a teacher network of the same architecture with fixed weights $(\mathbf{w}^*, \mathbf{a}^*)$, we prove that with Gaussian input $\mathbf{Z}$, there is a spurious local minimizer. Surprisingly, in the presence...

10.48550/arxiv.1712.00779 preprint EN other-oa arXiv (Cornell University) 2017-01-01
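
A minimal sketch of the architecture described here: non-overlapping patches Z_j of the input share a single filter w, pass through a ReLU, and are combined by output weights a. The teacher-student setup and plain gradient descent below are illustrative choices of mine, not necessarily the exact procedure analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
k, p, n = 8, 6, 512                  # number of non-overlapping patches, patch size, examples
Z = rng.normal(size=(n, k, p))       # Gaussian input, one row of patches per example

net = lambda Z, w, a: np.maximum(Z @ w, 0.0) @ a        # sum_j a_j * relu(w^T Z_j)

w_star, a_star = rng.normal(size=p), rng.normal(size=k)
y = net(Z, w_star, a_star)                              # labels from a fixed teacher (w*, a*)

w, a = rng.normal(size=p), rng.normal(size=k)           # student parameters to be learned
lr = 0.05
for _ in range(3000):
    H = np.maximum(Z @ w, 0.0)                          # (n, k) patch activations
    resid = H @ a - y
    grad_a = H.T @ resid / n
    grad_w = (Z * ((Z @ w > 0) * a)[..., None]).sum(axis=1).T @ resid / n
    a -= lr * grad_a
    w -= lr * grad_w
print(0.5 * np.mean(resid ** 2))     # small when descent avoids the spurious minimizer noted above
```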