Dao Nguyen

ORCID: 0000-0003-2215-613X
Research Areas
  • Statistical Methods and Inference
  • Bayesian Methods and Mixture Models
  • Markov Chains and Monte Carlo Methods
  • Gaussian Processes and Bayesian Inference
  • Face and Expression Recognition
  • Bayesian Modeling and Causal Inference
  • Sparse and Compressive Sensing Techniques
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Statistical Methods and Models
  • Statistical Methods and Bayesian Inference
  • COVID-19 epidemiological studies
  • Blind Source Separation Techniques
  • Financial Risk and Volatility Modeling
  • Stochastic Gradient Optimization Techniques
  • Sensory Analysis and Statistical Methods
  • Forecasting Techniques and Applications
  • Cognitive Science and Mapping
  • Medical Image Segmentation Techniques
  • Complex Systems and Time Series Analysis
  • Point processes and geometric inequalities
  • Speech and Audio Processing
  • Control Systems and Identification
  • Neural Networks and Applications
  • Domain Adaptation and Few-Shot Learning

Marist College
2024

University of Mississippi
2018-2023

University of California, Berkeley
2019

University of Michigan
2016

Stanford University
2008

Partially observed Markov process (POMP) models, also known as hidden Markov models or state space models, are ubiquitous tools for time series analysis. The R package pomp provides a very flexible framework for Monte Carlo statistical investigations using nonlinear, non-Gaussian POMP models. A range of modern statistical methods have been implemented in this framework, including sequential Monte Carlo, iterated filtering, particle Markov chain Monte Carlo, approximate Bayesian computation, maximum synthetic likelihood estimation, nonlinear forecasting, and...

10.18637/jss.v069.i12 article EN cc-by Journal of Statistical Software 2016-01-01
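As a rough illustration of the package's core building block, here is a minimal bootstrap particle filter in Python (numpy) estimating the log-likelihood of a toy stochastic-volatility POMP model. The model, parameter values, and function names are illustrative assumptions; this is not pomp's actual API.

```python
import numpy as np

def bootstrap_pfilter(y, n_particles=1000, phi=0.95, sigma=0.3, seed=0):
    """Bootstrap particle filter log-likelihood estimate for a toy
    stochastic-volatility POMP: x_t = phi*x_{t-1} + sigma*eps_t,
    y_t ~ N(0, exp(x_t))."""
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, sigma / np.sqrt(1 - phi**2), n_particles)  # stationary init
    loglik = 0.0
    for t in range(len(y)):
        x = phi * x + sigma * rng.normal(size=n_particles)           # propagate
        var = np.exp(x)
        w = np.exp(-0.5 * y[t]**2 / var) / np.sqrt(2 * np.pi * var)  # observation density
        loglik += np.log(w.mean())                                   # conditional likelihood
        x = rng.choice(x, size=n_particles, p=w / w.sum())           # multinomial resampling
    return loglik

# Usage (simulated data):
# y = np.random.default_rng(1).normal(size=200)
# print(bootstrap_pfilter(y))
```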

Iterated filtering algorithms are stochastic optimization procedures for latent variable models that recursively combine parameter perturbations with latent variable reconstruction. Previously, theoretical support for these algorithms has been based on the use of conditional moments of perturbed parameters to approximate derivatives of the log likelihood function. Here, a theoretical approach is introduced based on the convergence of an iterated Bayes map. An algorithm supported by this theory displays substantial numerical improvement on the computational challenge...

10.1073/pnas.1410597112 article EN Proceedings of the National Academy of Sciences 2015-01-07
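To make the perturbation idea concrete, the following Python sketch mimics an IF2-style iterated filter on the same toy model as above: each particle carries its own perturbed parameter, and the perturbation scale is cooled geometrically across iterations. Names and tuning constants are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def if2(y, n_particles=2000, n_iter=50, sigma0=0.1, a=0.95, seed=0):
    """IF2-style iterated filtering sketch: estimate the AR coefficient phi
    of the toy stochastic-volatility model by carrying a perturbed parameter
    on each particle and shrinking the perturbation across iterations."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.5, 0.99, n_particles)          # initial parameter swarm
    sigma_x = 0.3                                        # fixed state noise sd
    for m in range(n_iter):
        sd = sigma0 * a**m                               # cooled perturbation sd
        x = np.zeros(n_particles)
        for t in range(len(y)):
            theta = np.clip(theta + sd * rng.normal(size=n_particles), -0.999, 0.999)
            x = theta * x + sigma_x * rng.normal(size=n_particles)
            var = np.exp(x)
            w = np.exp(-0.5 * y[t]**2 / var) / np.sqrt(var)  # constant cancels in resampling
            idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
            x, theta = x[idx], theta[idx]                # resample states and parameters jointly
    return theta.mean()                                  # point estimate of phi
```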

Identifying statistical dependence between the features and the label is a fundamental problem in supervised learning. This paper presents a framework for estimating dependence between a numerical feature and a categorical label using the generalized Gini distance, an energy distance in reproducing kernel Hilbert spaces (RKHS). Two Gini-distance-based measures are explored: Gini distance covariance and Gini distance correlation. Unlike Pearson covariance and correlation, which do not characterize independence, the measures above define dependence as well as independence of random variables. The test statistics are simple to calculate...

10.1109/tpami.2019.2960358 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-12-17
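A hedged reading of the RKHS construction: with a Gaussian kernel K, the induced distance d(a,b) = sqrt(K(a,a) + K(b,b) - 2K(a,b)) replaces |a-b|, and the dependence measure contrasts overall versus within-class mean distances. The Python sketch below implements that contrast; the kernel choice and bandwidth are assumptions.

```python
import numpy as np

def gini_distance_correlation(x, y, sigma=1.0):
    """Generalized Gini distance covariance/correlation sketch between a
    numerical feature x and categorical label y, using the RKHS-induced
    distance for a Gaussian kernel (illustrative reading of the paper's
    energy-distance-in-RKHS construction)."""
    x, y = np.asarray(x, float), np.asarray(y)
    D = np.abs(x[:, None] - x[None, :])
    # kernel-induced distance: d(a,b)^2 = 2 - 2*exp(-(a-b)^2 / (2 sigma^2))
    Dk = np.sqrt(np.maximum(2.0 - 2.0 * np.exp(-D**2 / (2 * sigma**2)), 0.0))
    delta = Dk.mean()                          # overall mean kernel distance
    within = 0.0
    for c in np.unique(y):
        mask = y == c
        within += mask.mean() * Dk[np.ix_(mask, mask)].mean()
    gcov = delta - within                      # between-group signal
    return gcov, gcov / delta                  # covariance and correlation
```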

We propose a new Gini correlation to measure dependence between categorical and numerical variables. Analogous to Pearson R^2 in the ANOVA model, the Gini correlation is interpreted as the ratio of between-group variation to total variation, but it characterizes independence (zero Gini correlation mutually implies independence). Closely related to the distance correlation, it admits a simple formulation by taking the nature of the categorical variable into account. As a result, the proposed correlation has simpler computation and implementation and is more straightforward for performing inference. Simulation...

10.1111/sjos.12490 article EN Scandinavian Journal of Statistics 2020-09-04
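The ratio interpretation can be shown in a few lines: estimate Delta = E|X - X'| from all pairs and Delta_k from within-category pairs, then form rho_g = (Delta - sum_k p_k Delta_k) / Delta. A minimal Python sketch, assuming the plain absolute-value distance:

```python
import numpy as np

def gini_correlation(x, y):
    """Gini correlation sketch: ratio of between-group to total Gini-type
    variation, rho_g = (Delta - sum_k p_k * Delta_k) / Delta, with
    Delta = E|X - X'| overall and Delta_k within category k."""
    x, y = np.asarray(x, float), np.asarray(y)
    D = np.abs(x[:, None] - x[None, :])
    delta = D.mean()                                   # total Gini-type variation
    within = sum((y == c).mean() * D[np.ix_(y == c, y == c)].mean()
                 for c in np.unique(y))                # within-group part
    return (delta - within) / delta                    # zero iff independent (per the paper)

# Usage:
# rng = np.random.default_rng(0)
# y = rng.integers(0, 3, 300)
# x = y + rng.normal(size=300)        # dependent case: rho_g clearly positive
# print(gini_correlation(x, y))
```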

nimble is an R package for constructing algorithms and conducting inference on hierarchical models. The package provides a unique combination of flexible model specification and the ability to program model-generic algorithms. Specifically, it allows users to code models in the BUGS language and to write algorithms that can be applied to any appropriate model. In this paper, we introduce the nimbleSMC package, which contains algorithms for state-space model analysis using sequential Monte Carlo (SMC) techniques that are built in nimble. We first provide an overview...

10.18637/jss.v100.i03 article EN cc-by Journal of Statistical Software 2021-01-01
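The design point worth illustrating is the separation of model specification from model-generic algorithms. The Python sketch below (illustrative only, not nimble's API) defines a minimal model interface and a bootstrap filter that runs on any model exposing it.

```python
import numpy as np

class StateSpaceModel:
    """Minimal model interface mimicking the separation nimble draws
    between model specification and model-generic algorithms."""
    def __init__(self, init, transition, obs_loglik):
        self.init = init                # (n, rng) -> initial particles
        self.transition = transition    # (x, t, rng) -> propagated particles
        self.obs_loglik = obs_loglik    # (x, y_t) -> per-particle log-density

def generic_bootstrap_filter(model, y, n=1000, seed=0):
    """Model-generic SMC: works for any model exposing the interface above."""
    rng = np.random.default_rng(seed)
    x = model.init(n, rng)
    loglik = 0.0
    for t, yt in enumerate(y):
        x = model.transition(x, t, rng)
        logw = model.obs_loglik(x, yt)
        w = np.exp(logw - logw.max())                    # stable log-mean-exp
        loglik += logw.max() + np.log(w.mean())
        x = x[rng.choice(n, size=n, p=w / w.sum())]      # resample
    return loglik

# Example model: random-walk state, Gaussian observations (up to a constant).
# model = StateSpaceModel(
#     init=lambda n, rng: rng.normal(size=n),
#     transition=lambda x, t, rng: x + 0.1 * rng.normal(size=x.shape),
#     obs_loglik=lambda x, yt: -0.5 * (yt - x) ** 2,
# )
```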

10.1016/j.spl.2016.05.013 article EN publisher-specific-oa Statistics & Probability Letters 2016-05-27

10.1007/s11222-016-9711-9 article EN Statistics and Computing 2016-10-15

10.1007/s40304-023-00350-w article EN Communications in Mathematics and Statistics 2023-12-09

nimble is an R package for constructing algorithms and conducting inference on hierarchical models. The package provides a unique combination of flexible model specification and the ability to program model-generic algorithms. Specifically, it allows users to code models in the BUGS language and to write algorithms that can be applied to any appropriate model. In this paper, we introduce nimble's capabilities for state-space model analysis using sequential Monte Carlo (SMC) techniques. We first provide an overview of commonly used SMC algorithms, then...

10.48550/arxiv.1703.06206 preprint EN other-oa arXiv (Cornell University) 2017-01-01

We propose a new Gini correlation to measure dependence between categorical and numerical variables. Analogous to Pearson $R^2$ in the ANOVA model, the Gini correlation is interpreted as the ratio of between-group variation to total variation, but it characterizes independence (zero Gini correlation mutually implies independence). Closely related to the distance correlation, it admits a simple formulation by taking the nature of the categorical variable into account. As a result, the proposed correlation has lower computational cost and is more straightforward for performing inference. Simulation and real applications...

10.48550/arxiv.1809.09793 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Discretization of continuous-time diffusion processes is a widely recognized method for sampling. However, it is a considerable restriction that the potentials are often required to be smooth (gradient Lipschitz). This paper studies the problem of sampling through Euler discretization, where the potential function is assumed to be a mixture of weakly smooth distributions and to satisfy a dissipativity condition. We establish convergence in Kullback-Leibler (KL) divergence with the number of iterations needed to reach an ϵ-neighborhood of the target distribution...

10.1214/22-bjps538 article EN Brazilian Journal of Probability and Statistics 2022-09-01
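For the smooth baseline case, the Euler discretization in question is the Unadjusted Langevin Algorithm, x_{k+1} = x_k - h∇U(x_k) + sqrt(2h) ξ_k with ξ_k standard Gaussian. A minimal Python sketch follows; note the paper's contribution concerns weakly smooth mixture potentials, which this smooth-case sketch does not capture.

```python
import numpy as np

def ula_sample(grad_U, dim, n_steps=10_000, step=1e-2, seed=0):
    """Unadjusted Langevin Algorithm sketch: Euler-Maruyama discretization
    of the Langevin diffusion, targeting pi(x) proportional to exp(-U(x))."""
    rng = np.random.default_rng(seed)
    x = np.zeros(dim)
    samples = np.empty((n_steps, dim))
    for k in range(n_steps):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.normal(size=dim)
        samples[k] = x
    return samples

# Target: standard Gaussian, U(x) = ||x||^2 / 2, so grad_U(x) = x.
# s = ula_sample(lambda x: x, dim=2)
# print(s[2000:].mean(axis=0), s[2000:].var(axis=0))   # near 0 and 1
```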

This paper introduces a method for efficiently approximating the inverse of the Fisher information matrix, a crucial step in achieving effective variational Bayes inference. A notable aspect of our approach is the avoidance of analytically computing the Fisher information matrix and its explicit inversion. Instead, we introduce an iterative procedure that generates a sequence of matrices converging to the inverse of the Fisher information. The natural gradient variational Bayes algorithm without analytic expression of the Fisher matrix and its inversion is provably convergent and achieves a convergence rate of order O(log s/s), with s the number of iterations...

10.1080/01621459.2024.2392904 article EN cc-by-nc-nd Journal of the American Statistical Association 2024-08-22
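One generic way to build such a matrix sequence without any explicit inversion is the Newton-Schulz iteration X <- X(2I - FX), which converges to F^{-1} from a suitable start. The sketch below is a standard scheme offered only as an illustration, not necessarily the recursion used in the paper.

```python
import numpy as np

def newton_schulz_inverse(F, n_iter=30):
    """Invert a (Fisher-like) SPD matrix by iteration alone: Newton-Schulz
    updates converge quadratically to F^{-1} when ||I - F X0|| < 1."""
    n = F.shape[0]
    # Classical safe initialization: X0 = F^T / (||F||_1 * ||F||_inf).
    X = F.T / (np.linalg.norm(F, 1) * np.linalg.norm(F, np.inf))
    I = np.eye(n)
    for _ in range(n_iter):
        X = X @ (2 * I - F @ X)       # no explicit inversion anywhere
    return X

# F = np.array([[2.0, 0.3], [0.3, 1.0]])
# print(newton_schulz_inverse(F) @ F)   # approximately the identity
```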

Markov chain Monte Carlo (MCMC) methods are ubiquitous tools for simulation-based inference in many fields, but designing and identifying good MCMC samplers is still an open question. This paper introduces a novel MCMC algorithm, namely, Nested Adaptation MCMC. For sampling variables or blocks of variables, we use two levels of adaptation, where the inner level optimizes the MCMC performance within each sampler, while the outer level explores the space of valid kernels to find the optimal samplers. We provide a theoretical foundation for our...

10.1214/19-ba1190 article EN Bayesian Analysis 2019-12-09
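The inner level of such a scheme can be as simple as Robbins-Monro tuning of a random-walk proposal scale toward a target acceptance rate. The Python sketch below shows only this inner level (the outer search over sampler types is omitted); the constants are illustrative assumptions.

```python
import numpy as np

def adaptive_rwm(logpi, x0, n_iter=5000, target_acc=0.44, seed=0):
    """Inner-level adaptation sketch: random-walk Metropolis whose proposal
    scale is tuned toward a target acceptance rate; the diminishing
    adaptation rate (i+1)^{-0.6} preserves ergodicity."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, float)
    lp, log_scale = logpi(x), 0.0
    chain = np.empty((n_iter, x.size))
    for i in range(n_iter):
        prop = x + np.exp(log_scale) * rng.normal(size=x.shape)
        lp_prop = logpi(prop)
        acc = np.exp(min(0.0, lp_prop - lp))               # acceptance probability
        if rng.uniform() < acc:
            x, lp = prop, lp_prop
        log_scale += (acc - target_acc) / (i + 1) ** 0.6   # Robbins-Monro update
        chain[i] = x
    return chain

# chain = adaptive_rwm(lambda x: -0.5 * x @ x, x0=[3.0, -3.0])
```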

In simulation-based inference for partially observed Markov process (POMP) models, a by-product of Monte Carlo filtering is an approximation of the log likelihood function. Recently, iterated filtering [14, 13] was introduced, and it has been shown that the gradient of the log likelihood can also be approximated. Consequently, different stochastic optimization algorithms can be applied to estimate the parameters of the underlying models. As acceleration is an efficient approach in the optimization literature, we show that iterated filtering can be accelerated in the same manner and inherit the high convergence...

10.48550/arxiv.1802.08613 preprint EN cc-by arXiv (Cornell University) 2018-01-01
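The generic optimization pattern being borrowed is Nesterov-style acceleration applied to a noisy gradient oracle. A minimal Python sketch, with the SMC-based gradient approximation replaced by a hypothetical noisy_grad callable:

```python
import numpy as np

def accelerated_ascent(noisy_grad, theta0, n_iter=200, step=0.05, momentum=0.9):
    """Nesterov-style accelerated ascent on a noisy log-likelihood gradient.
    In the paper the gradient comes from sequential Monte Carlo filtering;
    here noisy_grad is any stochastic gradient oracle."""
    theta = np.array(theta0, float)
    v = np.zeros_like(theta)
    for _ in range(n_iter):
        lookahead = theta + momentum * v          # evaluate gradient at lookahead point
        v = momentum * v + step * noisy_grad(lookahead)
        theta = theta + v                         # ascent step (maximization)
    return theta

# Toy check: maximize -(theta - 2)^2 with noisy gradients.
# rng = np.random.default_rng(0)
# g = lambda th: -2 * (th - 2) + 0.1 * rng.normal(size=th.shape)
# print(accelerated_ascent(g, [0.0]))             # near 2.0
```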

Discretization of continuous-time diffusion processes is a widely recognized method for sampling. However, the canonical Euler-Maruyama discretization of the Langevin diffusion process, also named Langevin Monte Carlo (LMC), has been studied mostly in the context of smooth (gradient-Lipschitz) and strongly log-concave densities, a significant constraint on its deployment in many sciences, including computational statistics and statistical learning. In this paper, we establish several theoretical contributions to the literature on such sampling...

10.48550/arxiv.2002.10071 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Discretization of continuous-time diffusion processes is a widely recognized method for sampling. However, the canonical Euler-Maruyama discretization of the Langevin diffusion process, referred to as the Unadjusted Langevin Algorithm (ULA), has been studied mostly in the context of smooth (gradient Lipschitz) and strongly log-concave densities, a considerable hindrance to its deployment in many sciences, including statistics and machine learning. In this paper, we establish several theoretical contributions to the literature on such sampling methods...

10.48550/arxiv.2101.06369 preprint EN cc-by arXiv (Cornell University) 2021-01-01

Simulation-based inferences have attracted much attention in recent years, as the direct computation of the likelihood function in many real-world problems is difficult or even impossible. Iterated filtering (Ionides, Bretó, and King 2006; Ionides, Bhadra, Atchadé, and King 2011) enables likelihood maximization via model perturbations and approximation of the gradient of the log-likelihood through sequential Monte Carlo filtering. By an application of Stein's identity, Doucet, Jacob, and Rubenthaler (2013) developed a second-order...

10.17713/ajs.v52i4.1503 article EN cc-by Austrian Journal of Statistics 2023-07-19

This paper presents an approach for efficiently approximating the inverse of the Fisher information matrix, a key component in variational Bayes inference. A notable aspect of our approach is the avoidance of analytically computing the Fisher information matrix and its explicit inversion. Instead, we introduce an iterative procedure that generates a sequence of matrices converging to the inverse of the Fisher information. The natural gradient variational Bayes algorithm without matrix inversion is provably convergent and achieves a convergence rate of order O(log s/s), with s the number of iterations. We also...

10.48550/arxiv.2312.09633 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Markov chain Monte Carlo (MCMC) methods are ubiquitous tools for simulation-based inference in many fields, but designing and identifying good MCMC samplers is still an open question. This paper introduces a novel MCMC algorithm, namely, Auto Adapt MCMC. For sampling variables or blocks of variables, we use two levels of adaptation, where the inner level optimizes the MCMC performance within each sampler, while the outer level explores the space of valid kernels to find the optimal samplers. We provide a theoretical foundation for our...

10.48550/arxiv.1802.08798 preprint EN cc-by arXiv (Cornell University) 2018-01-01

Partially observed Markov process (POMP) models are powerful tools for time series modeling and analysis. Inheriting the flexible framework of the R package pomp, the is2 package extends some useful Monte Carlo statistical methodologies to improve convergence rates. A variety of efficient methods for POMP models have been developed, including fixed lag smoothing, second-order iterated smoothing, momentum iterated filtering, average iterated filtering, accelerated iterated filtering, and particle iterated filtering. In this paper, we show the utility of these methods based on two toy problems. We also...

10.48550/arxiv.1811.02963 preprint EN cc-by arXiv (Cornell University) 2018-01-01
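Of the methodologies listed, fixed lag smoothing is the easiest to sketch: carry each particle's recent trajectory through resampling and report the smoothed state estimate at lag L. The Python sketch below reuses the toy stochastic-volatility model from earlier and is illustrative, not the is2 implementation.

```python
import numpy as np

def fixed_lag_smoother(y, L=10, n=1000, phi=0.95, sigma=0.3, seed=0):
    """Fixed-lag smoothing sketch: store particle trajectories through
    resampling and report E[x_{t-L} | y_{1:t}] once t >= L."""
    rng = np.random.default_rng(seed)
    T = len(y)
    hist = np.zeros((n, T))                       # particle trajectories
    x = rng.normal(0, sigma / np.sqrt(1 - phi**2), n)
    est = np.full(T, np.nan)
    for t in range(T):
        x = phi * x + sigma * rng.normal(size=n)
        hist[:, t] = x
        var = np.exp(x)
        w = np.exp(-0.5 * y[t]**2 / var) / np.sqrt(var)  # constant cancels
        idx = rng.choice(n, size=n, p=w / w.sum())
        x, hist = x[idx], hist[idx]               # resample with full ancestry
        if t >= L:
            est[t - L] = hist[:, t - L].mean()    # lag-L smoothed estimate
    return est
```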