Yuling Yao

ORCID: 0000-0002-0985-7233
Research Areas
  • Statistical Methods and Bayesian Inference
  • Gaussian Processes and Bayesian Inference
  • Statistical Methods and Inference
  • Bayesian Methods and Mixture Models
  • Heavy metals in environment
  • COVID-19 epidemiological studies
  • Markov Chains and Monte Carlo Methods
  • Radioactivity and Radon Measurements
  • Simulation and Modeling Applications
  • Advanced Statistical Methods and Models
  • COVID-19 Pandemic Impacts
  • Air Quality and Health Impacts
  • Evaluation and Optimization Models
  • Bayesian Modeling and Causal Inference
  • Scientific Computing and Data Management
  • Heavy Metal Exposure and Toxicity
  • Mental Health Research Topics
  • Behavioral Health and Interventions
  • Evaluation Methods in Various Fields
  • Economic Zones and Regional Development
  • Infrastructure Maintenance and Monitoring
  • Data Quality and Management
  • Probability and Statistical Research
  • Resource-Constrained Project Scheduling
  • Domain Adaptation and Few-Shot Learning

Flatiron Health (United States)
2021-2024

Flatiron Institute
2021-2023

Columbia University
2017-2021

Faculty of 1000 (United States)
2018

Google (United States)
2017

Chang'an University
2006-2008

Yong In University
2006

The widely recommended procedure of Bayesian model averaging is flawed in the M-open setting, in which the true data-generating process is not one of the candidate models being fit. We take the idea of stacking from the point estimation literature and generalize it to the combination of predictive distributions, extending the utility function to any proper scoring rule, using Pareto smoothed importance sampling to efficiently compute the required leave-one-out posterior distributions, and using regularization to get more stability. We compare several...

10.1214/17-ba1091 article EN Bayesian Analysis 2018-01-16
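
As a rough illustration of the stacking idea in the abstract above, the sketch below chooses simplex weights that maximize the summed log score of the combined leave-one-out predictive density. It is not the paper's implementation: the function names, the fixed-point (EM-style) update, and the density values are assumptions made for illustration, and the leave-one-out densities are taken as given rather than computed with Pareto smoothed importance sampling.

```python
import math


def stacking_weights(loo_dens, n_iter=500):
    """Estimate stacking weights under the log score.

    loo_dens[i][k] is assumed to hold the leave-one-out predictive
    density of model k evaluated at held-out observation i.
    The fixed-point update below keeps the weights on the simplex and
    never decreases the summed log score of the mixture.
    """
    n = len(loo_dens)
    k = len(loo_dens[0])
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(n_iter):
        new_w = [0.0] * k
        for row in loo_dens:
            mix = sum(wj * pj for wj, pj in zip(w, row))
            for j in range(k):
                new_w[j] += w[j] * row[j] / mix
        w = [v / n for v in new_w]
    return w


def log_score(w, loo_dens):
    """Summed log predictive density of the weighted combination."""
    return sum(
        math.log(sum(wj * pj for wj, pj in zip(w, row))) for row in loo_dens
    )


# Hypothetical leave-one-out densities for 4 observations and 2 models.
loo_dens = [
    [0.30, 0.05],
    [0.25, 0.10],
    [0.02, 0.40],
    [0.28, 0.08],
]
weights = stacking_weights(loo_dens)
print(weights, log_score(weights, loo_dens))
```

With these made-up numbers, the combined predictive density scores at least as well as either model alone, which is the behaviour stacking is designed to deliver when no single candidate is the true model.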

The Bayesian approach to data analysis provides a powerful way to handle uncertainty in all observations, model parameters, and model structure using probability theory. Probabilistic programming languages make it easier to specify and fit Bayesian models, but this still leaves us with many options regarding constructing, evaluating, and using these models, along with many remaining challenges in computation. Using Bayesian inference to solve real-world problems requires not only statistical skills, subject matter knowledge, and programming, but also awareness of...

10.48550/arxiv.2011.01808 preprint EN cc-by arXiv (Cornell University) 2020-01-01

The non-Gaussian spatial distribution of galaxies traces the large-scale structure of the Universe and therefore constitutes a prime observable to constrain cosmological parameters. We conduct Bayesian inference of the $\Lambda$CDM parameters $\Omega_m$,...

10.1103/physrevd.109.083535 article EN cc-by Physical Review D 2024-04-30

In an earlier article in this journal, Gronau and Wagenmakers (2018) discuss some problems with leave-one-out cross-validation (LOO) for Bayesian model selection. However, the variant of LOO that Gronau and Wagenmakers discuss is at odds with a long literature on how to use LOO well. In this discussion, we discuss the use of LOO in practical data analysis, from the perspective that we need to abandon the idea that there is a device that will produce a single-number decision rule.

10.1007/s42113-018-0020-6 article EN cc-by Computational Brain & Behavior 2018-11-30

We applied three Bayesian methods to reanalyse the preregistered contributions to the Social Psychology special issue 'Replications of Important Results in Psychology' (Nosek & Lakens. 2014 Registered reports: a method to increase the credibility of published results. Soc. Psychol. 45, 137-141. (doi:10.1027/1864-9335/a000192)). First, individual-experiment Bayesian parameter estimation revealed that for directed effect size measures, only out 44 central 95% credible intervals did not overlap with zero and fell...

10.1098/rsos.160426 article EN cc-by Royal Society Open Science 2017-01-01

While it's always possible to compute a variational approximation to a posterior distribution, it can be difficult to discover problems with this approximation. We propose two diagnostic algorithms to alleviate this problem. The Pareto-smoothed importance sampling (PSIS) diagnostic gives a goodness of fit measurement for joint distributions, while simultaneously improving the error in the estimate. The variational simulation-based calibration (VSBC) assesses the average performance of point estimates.

10.48550/arxiv.1802.02538 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Stacking is a widely used model averaging technique that asymptotically yields optimal predictions among linear averages. We show that stacking is most effective when model predictive performance is heterogeneous in inputs, and we can further improve the stacked mixture with a hierarchical model. We generalize stacking to Bayesian hierarchical stacking. The model weights are varying as a function of data, partially-pooled, and inferred using Bayesian inference. We further incorporate discrete and continuous inputs, other structured priors, and time series and longitudinal data. To...

10.1214/21-ba1287 article EN Bayesian Analysis 2021-09-27

When working with multimodal Bayesian posterior distributions, Markov chain Monte Carlo (MCMC) algorithms have difficulty moving between modes, and default variational or mode-based approximate inferences will understate posterior uncertainty. And, even if the most important modes can be found, it is difficult to evaluate their relative weights in the posterior. Here we propose an approach using parallel runs of MCMC, variational, or mode-based inference to hit as many modes or separated regions as possible and then combine these...

10.48550/arxiv.2006.12335 preprint EN other-oa arXiv (Cornell University) 2020-01-01

This article is an invited discussion of the article by Gronau and Wagenmakers (2018) that can be found at https://dx.doi.org/10.1007/s42113-018-0011-7.

10.48550/arxiv.1810.05374 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Importance: A slow or incomplete civil registry makes it impossible to determine excess mortality due to COVID-19 and difficult to inform policy. Objective: To quantify the association of the pandemic with household income in rural Bangladesh in 2020. Design, Setting, and Participants: This repeated survey study is based on an in-person census followed by 2 rounds of telephone calls. Data were collected from a sample of 135 villages within a densely populated 350-km² rural area...

10.1001/jamanetworkopen.2021.32777 article EN cc-by-nc-nd JAMA Network Open 2021-11-15

The roof and spire of Notre-Dame cathedral in Paris that caught fire and collapsed on 15 April 2019 were covered with 460 t of lead (Pb). Government reports documented Pb deposition immediately downwind of the cathedral and a twentyfold increase in airborne Pb concentrations at a distance of 50 km in the aftermath. For this study, we collected 100 samples of surface soil from tree pits, parks, and other sites in all directions within 1 km of the cathedral. Concentrations of Pb measured by X-ray fluorescence range from 30 to 9,000 mg/kg across the area, with higher...

10.1029/2020gh000279 article EN cc-by GeoHealth 2020-07-09

Every philosophy has holes, and it is the responsibility of proponents of a philosophy to point out these problems. Here are a few holes in Bayesian data analysis: (1) the usual rules of conditional probability fail in the quantum realm, (2) flat or weak priors lead to terrible inferences about things we care about, (3) subjective priors are incoherent, (4) Bayesian decision picks the wrong model, (5) Bayes factors fail in the presence of flat or weak priors, (6) for Cantorian reasons we need to check our models, but this destroys the coherence of Bayesian inference. Some of the problems...

10.1088/1361-6471/abc3a5 article EN Journal of Physics G Nuclear and Particle Physics 2020-10-21

Two main obstacles preventing the widespread adoption of variational Bayesian neural networks are the high parameter overhead that makes them infeasible on large networks, and the difficulty of implementation, which can be thought of as "programming overhead." MC dropout [Gal and Ghahramani, 2016] is popular because it sidesteps these obstacles. Nevertheless, dropout is often harmful to model performance when used in networks with batch normalization layers [Li et al., 2018], which are an indispensable part of modern neural networks. We construct a...

10.48550/arxiv.1905.09453 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Background: Excess mortality has demonstrated under-counting of COVID-19 deaths in many countries but cannot be measured in low-income countries where civil registration is incomplete. Methods: Enumerators conducted an in-person census of all 16,054 households in a sample of 135 villages within a 350 km² region of Bangladesh, followed by a census again in May and November 2020 over the phone. The date and cause of any changes in household composition, as well as changes in income and food availability, were recorded. For analysis, we stratify the data...

10.1101/2021.05.07.21256865 preprint EN cc-by-nc-nd medRxiv (Cold Spring Harbor Laboratory) 2021-05-12

The normalizing constant plays an important role in Bayesian computation, and there is a large literature on methods for computing or approximating normalizing constants that cannot be evaluated in closed form. When the normalizing constant varies by orders of magnitude, methods based on importance sampling can require many rounds of tuning. We present an improved approach using adaptive path sampling, iteratively reducing gaps between the base and target. Using this adaptive strategy, we develop two metastable sampling schemes. They are automated in Stan and require little tuning. For...

10.48550/arxiv.2009.00471 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The roof and spire of Notre-Dame cathedral in Paris that caught fire and collapsed on April 15, 2019, were covered with 460 tons of lead (Pb). Government reports documented Pb deposition immediately downwind of the cathedral and a 20-fold increase in airborne Pb concentrations at a distance of 50 km in the aftermath. For this study, we collected 100 samples of surface soil from tree pits, parks, and other sites in all directions within 1 km of the cathedral. Concentrations of Pb measured by X-ray fluorescence range from 30 to 9000 mg/kg across the area, with higher...

10.1002/essoar.10503270.3 preprint EN cc-by-nc-nd 2020-07-03

Variational inference (VI) is a method to approximate the computationally intractable posterior distributions that arise in Bayesian statistics. Typically, VI fits a simple parametric distribution to the target posterior by minimizing an appropriate objective such as the evidence lower bound (ELBO). In this work, we present a new approach to VI based on the principle of score matching, that if two distributions are equal then their score functions (i.e., gradients of the log density) are equal at every point on their support. With this, we develop score matching VI, an iterative...

10.48550/arxiv.2307.07849 preprint EN cc-by arXiv (Cornell University) 2023-01-01

The non-Gaussian spatial distribution of galaxies traces the large-scale structure of the Universe and therefore constitutes a prime observable to constrain cosmological parameters. We conduct Bayesian inference of the $\Lambda$CDM parameters $\Omega_m$, $\Omega_b$, $h$, $n_s$, and $\sigma_8$ from the BOSS CMASS galaxy sample by combining the wavelet scattering transform (WST) with a simulation-based inference approach enabled by the SimBIG forward model. We design a set of reduced WST statistics that leverage symmetries...

10.48550/arxiv.2310.15250 preprint EN other-oa arXiv (Cornell University) 2023-01-01