Dennis L. Sun

ORCID: 0000-0003-0116-2004
Research Areas
  • Statistical Methods and Inference
  • Bayesian Methods and Mixture Models
  • Statistics Education and Methodologies
  • Speech and Audio Processing
  • Statistical Methods and Bayesian Inference
  • Data Analysis with R
  • Computational Physics and Python Applications
  • Gaussian Processes and Bayesian Inference
  • Speech Recognition and Synthesis
  • Blind Source Separation Techniques
  • Sparse and Compressive Sensing Techniques
  • Imbalanced Data Classification Techniques
  • Image and Signal Denoising Methods
  • Heavy metals in environment
  • Obesity, Physical Activity, Diet
  • Smoking Behavior and Cessation
  • Music and Audio Processing
  • COVID-19 Clinical Research Studies
  • Innovations in Educational Methods
  • Meta-analysis and systematic reviews
  • Advanced Statistical Process Monitoring
  • PARP inhibition in cancer therapy
  • Photoacoustic and Ultrasonic Imaging
  • Time Series Analysis and Forecasting
  • Vitamin C and Antioxidants Research

Stanford University
2013-2025

California Polytechnic State University
2015-2024

Cal Poly Corporation
2017-2022

Automotive Fuel Cell Cooperation (Canada)
2017

University of California, Berkeley
2016

Intel (United States)
2014

We develop a general approach to valid inference after model selection. At the core of our framework is a result that characterizes the distribution of a post-selection estimator conditioned on the selection event. We specialize the approach to model selection by the lasso to form valid confidence intervals for the selected coefficients and test whether all relevant variables have been included in the model.

10.1214/15-aos1371 article EN other-oa The Annals of Statistics 2016-04-11

To perform inference after model selection, we propose controlling the selective type I error; i.e., the error rate of a test given that it was performed. By doing so, we recover long-run frequency properties among selected hypotheses analogous to those that apply in the classical (non-adaptive) context. Our proposal is closely related to data splitting and has a similar intuitive justification, but is more powerful. Exploiting the classical theory of Lehmann and Scheffé (1955), we derive most powerful unbiased selective tests and confidence intervals...

10.48550/arxiv.1410.2597 preprint EN other-oa arXiv (Cornell University) 2014-01-01
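
The selective type I error described in the abstract above can be illustrated with a toy simulation. The sketch below is an illustration under stated assumptions, not the procedure from the paper: every observation is a standard normal draw with true mean zero, a hypothesis is "selected" only when the observation exceeds a threshold of 1, and the function name `selective_error_rates` and all parameter values are hypothetical. It compares the usual one-sided 5% cutoff with a cutoff calibrated to the conditional distribution given selection.

```python
import random
from statistics import NormalDist


def selective_error_rates(m=200_000, threshold=1.0, alpha=0.05, seed=0):
    """Estimate rejection rates among selected true nulls (toy example)."""
    rng = random.Random(seed)
    std = NormalDist()  # standard normal, the null distribution assumed here

    # Naive cutoff: ignores that we only test when y > threshold.
    naive_cut = std.inv_cdf(1 - alpha)

    # Selective cutoff c solves P(Y > c | Y > threshold) = alpha,
    # i.e. P(Y > c) = alpha * P(Y > threshold).
    p_select = 1 - std.cdf(threshold)
    selective_cut = std.inv_cdf(1 - alpha * p_select)

    # Keep only the observations that pass the selection step.
    selected = [y for y in (rng.gauss(0.0, 1.0) for _ in range(m)) if y > threshold]

    naive = sum(y > naive_cut for y in selected) / len(selected)
    selective = sum(y > selective_cut for y in selected) / len(selected)
    return naive, selective


naive_rate, selective_rate = selective_error_rates()
print(f"naive rejection rate among selected:     {naive_rate:.3f}")
print(f"selective rejection rate among selected: {selective_rate:.3f}")
```

In this setup the naive cutoff rejects far more than 5% of the selected true nulls, while the conditionally calibrated cutoff stays close to the nominal level.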

Non-negative matrix factorization (NMF) is a popular method for learning interpretable features from non-negative data, such as counts or magnitudes. Different cost functions are used with NMF in different applications. We develop an algorithm, based on the alternating direction method of multipliers, that tackles NMF problems whose cost function is a beta-divergence, a broad class of divergence functions. We derive simple, closed-form updates for the most commonly used beta-divergences. We demonstrate experimentally that this algorithm has...

10.1109/icassp.2014.6854796 article EN 2014-05-01
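
For readers unfamiliar with the beta-divergence cost mentioned in the abstract above, here is a minimal sketch of how it can be evaluated for a candidate factorization. It uses the standard definition of the beta-divergence (Kullback-Leibler at beta = 1, Itakura-Saito at beta = 0, half the squared error at beta = 2). It does not implement the paper's alternating direction method of multipliers algorithm, and the helper names `beta_divergence` and `nmf_cost` and the example matrices are hypothetical.

```python
import math


def beta_divergence(x, y, beta):
    """Beta-divergence d(x | y) between two positive scalars."""
    if x <= 0 or y <= 0:
        raise ValueError("this sketch assumes strictly positive inputs")
    if beta == 1:  # Kullback-Leibler divergence (limit as beta -> 1)
        return x * math.log(x / y) - x + y
    if beta == 0:  # Itakura-Saito divergence (limit as beta -> 0)
        return x / y - math.log(x / y) - 1
    return (x**beta + (beta - 1) * y**beta - beta * x * y ** (beta - 1)) / (
        beta * (beta - 1)
    )


def nmf_cost(V, W, H, beta):
    """Sum of entrywise beta-divergences between V and the product W H."""
    total = 0.0
    for i, row in enumerate(V):
        for j, v in enumerate(row):
            approx = sum(W[i][k] * H[k][j] for k in range(len(H)))
            total += beta_divergence(v, approx, beta)
    return total


# A rank-1 matrix that W H reproduces exactly, so the cost is zero.
V = [[2.0, 4.0], [1.0, 2.0]]
W = [[2.0], [1.0]]
H = [[1.0, 2.0]]
exact_fit_cost = nmf_cost(V, W, H, beta=1)
print(exact_fit_cost)
```

The choice of beta changes how errors at small and large magnitudes are weighted, which is why different applications favor different members of the family.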

Background Evidence of racial/ethnic inequalities in tobacco outlet density is limited by: (1) reliance on studies from single counties or single states, (2) limited attention to spatial dependence, and (3) an unclear theory-based relationship between neighbourhood composition and tobacco outlet density. Methods In 97 counties in the contiguous USA, we calculated the 2012 density of likely tobacco outlets (N=90 407), defined as tobacco outlets per 1000 population in census tracts (n=17 667). We used 2 spatial regression techniques: (1) a spatial errors approach in GeoDa software and (2) fitting a covariance...

10.1136/jech-2016-208475 article EN Journal of Epidemiology & Community Health 2017-03-01

Supervised and semi-supervised source separation algorithms based on non-negative matrix factorization have been shown to be quite effective. However, they require isolated training examples of one or more sources, which is often difficult to obtain. This limits the practical applicability of these algorithms. We examine the problem of efficiently utilizing general training data in the absence of specific training examples. Specifically, we propose a method to learn a universal speech model from a general corpus of speech and show how to use this model to separate...

10.1109/icassp.2013.6637625 article EN IEEE International Conference on Acoustics Speech and Signal Processing 2013-05-01

Retail marketing surveillance research highlights concerns about lower priced cigarettes in neighborhoods with a higher proportion of racial/ethnic minorities but focuses almost exclusively on premium brands. To remedy this gap in the literature, the current study examines neighborhood variation in prices for the cheapest cigarettes and a popular brand of cigarillos in a large statewide sample of licensed tobacco retailers in a low-tax state. All 61 local health departments in California trained data collectors to conduct observations...

10.1093/ntr/ntx089 article EN Nicotine & Tobacco Research 2017-04-20

Voice activity detection (VAD) in the presence of heavy, nonstationary noise is a challenging problem that has attracted attention in recent years. Most modern VAD systems require training on highly specialized data: either labeled mixtures of speech and noise that are matched to the application, or, at the very least, noise data similar to that encountered in the application. Because obtaining labeled data can be a laborious task in practical applications, it is desirable for a voice activity detector to be able to perform well in the presence of any type of noise without the need for matched training data. In this paper, we...

10.21437/interspeech.2013-204 article EN Interspeech 2013 2013-08-25

Purpose – The purpose of this paper is to provide an example of Lean Six Sigma (LSS) application in research and development (R&D) organizations to eliminate waste and improve systems based on available data that in turn improves the innovative environment. Manufacturing R&D involves designing and testing concepts and taking them into high-volume manufacturing. The infrastructure associated with such experimental manufacturing lines and the ability to evaluate the result under statistical process control configuration...

10.1108/ijlss-02-2014-0004 article EN International Journal of Lean Six Sigma 2014-10-28

Feedback has a powerful influence on learning, but it is also expensive to provide. In large classes it may even be impossible for instructors to provide individualized feedback. Peer assessment is one way to provide personalized feedback that scales to large classes. Besides these obvious logistical benefits, it has been conjectured that students also learn from the practice of peer assessment. However, this has never been conclusively demonstrated. Using an online educational platform that we developed, we conducted an in-class matched-set, randomized...

10.1371/journal.pone.0143177 article EN cc-by PLoS ONE 2015-12-18

10.1080/01621459.2024.2421994 article Journal of the American Statistical Association 2025-01-02

Bandwidth extension is the problem of recovering missing bandwidth in audio signals that have been band-passed, typically for compression purposes. One approach that has been shown to be successful for bandwidth extension is non-negative matrix factorization (NMF). The disadvantage of NMF is that it is non-convex and intractable to solve in general. However, in bandwidth extension, only the reconstruction is needed and not the explicit factors. We formulate bandwidth extension as a convex optimization problem, propose a simple algorithm, and demonstrate the effectiveness of this approach on practical examples.

10.1109/mlsp.2013.6661924 article EN 2013-09-01

The problem of recovering a signal from the magnitude of its short-time Fourier transform (STFT) is a longstanding one in audio processing. Existing approaches rely on heuristics that often perform poorly because of the nonconvexity of the problem. We introduce a formulation of the problem that lends itself to a tractable convex program. We observe that our method yields better reconstructions than the standard Griffin-Lim algorithm. We provide an algorithm and discuss practical implementation details, including how the method can be scaled up to larger examples.

10.48550/arxiv.1209.2076 preprint EN other-oa arXiv (Cornell University) 2012-01-01

10.1080/01621459.2021.1967164 article EN cc-by Journal of the American Statistical Association 2021-08-17

Abstract Introduction US college students smoke hookah and vape nicotine at higher rates than other young adults. Density and/or proximity of hookah lounges and vape shops near colleges has been described, but this study is the first to test whether tobacco retailers spatially cluster near college campuses. Aims and Methods We created and linked spatial shapefiles for community colleges and 4-year colleges in California with lists of hookah lounges, vape shops, and licensed tobacco retailers. We simulated 100 datasets, placing retailers randomly in census tracts in proportion to population...

10.1093/ntr/ntac007 article EN Nicotine & Tobacco Research 2022-01-06

Simulation is an effective tool for analyzing probability models as well as for facilitating understanding of concepts in probability and statistics. Unfortunately, implementing a simulation from scratch often requires users to think about programming issues that are not relevant to the simulation itself. We have developed a Python package called Symbulate (https://github.com/dlsun/symbulate) which provides a user friendly framework for conducting simulations involving probability models. The syntax of Symbulate reflects the "language of probability" and makes it...

10.1080/10691898.2019.1600387 article EN cc-by Journal of Statistics Education 2019-01-02

Background: Coronavirus Disease 2019 (COVID-19) has no known specific treatments. However, there might be in vitro and early clinical data as well as evidence from Severe Acute Respiratory Syndrome and Middle Eastern Respiratory Syndrome that could inform clinicians and researchers. This systematic review aims to create priorities for future research of drugs repurposed for COVID-19. Methods: This systematic review will include in vitro, animal, and clinical studies evaluating the efficacy of a list of 34 specific compounds and four groups of drugs identified in a previous scoping review. Studies...

10.1101/2020.05.21.20109074 preprint EN cc-by medRxiv (Cold Spring Harbor Laboratory) 2020-05-23

One of the most attractive features of R is its linear modeling capabilities. We describe a Python package, salmon, that brings the best of R's linear modeling functionality to Python in a Pythonic way, by providing composable objects for specifying and fitting linear models. This object-oriented design also enables other features that enhance ease-of-use, such as automatic visualizations and intelligent model building.

10.18637/jss.v108.i08 article EN cc-by Journal of Statistical Software 2024-01-01

Abstract Background Coronavirus disease 2019 (COVID-19) has no confirmed specific treatments. However, there might be in vitro and early clinical data as well as evidence from severe acute respiratory syndrome and Middle Eastern respiratory syndrome that could inform clinicians and researchers. This systematic review aims to create priorities for future research of drugs repurposed for COVID-19. Methods This systematic review will include in vitro, animal, and clinical studies evaluating the efficacy of a list of 34 specific compounds and 4 groups of drugs identified in a previous scoping review....

10.1186/s13643-021-01693-7 article EN cc-by Systematic Reviews 2021-05-07

Lead (Pb) is one of the most common heavy metal urban soil contaminants with well-known toxicity to humans. This incubation study (2–159 d) compared the ability of bone meal (BM), potassium hydrogen phosphate (KP), and triple superphosphate (TSP), at phosphorus:lead (P:Pb) molar ratios of 7.5:1, 15:1, and 22.5:1, to reduce bioaccessible Pb in soil contaminated by Pb-based paint relative to a control soil to which no P amendment was added. Soil pH and Mehlich 3 were measured as a function of time and of the amount and type of P amendment. XAS assessed...

10.1016/j.chemosphere.2024.142645 article EN cc-by-nc-nd Chemosphere 2024-06-19

We demonstrate how data fission, a method for creating synthetic replicates from single observations, can be applied to empirical Bayes estimation. This extends recent work on empirical Bayes estimation with multiple replicates to the classical single-replicate setting. The key insight is that after data fission, empirical Bayes estimation can be cast as a general regression problem.

10.48550/arxiv.2410.12117 preprint EN arXiv (Cornell University) 2024-10-15
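
As a rough illustration of the data fission idea mentioned in the abstract above, the sketch below splits a single Gaussian observation into two synthetic replicates. It assumes the well-known Gaussian construction with known noise standard deviation `sigma`: adding and subtracting independent noise Z ~ N(0, sigma^2) gives two pieces, X + tau*Z and X - Z/tau, that are uncorrelated (and, being jointly Gaussian, independent). The function names, parameter values, and the choice of this particular construction are assumptions made for illustration; the preprint's own setup may differ.

```python
import random


def gaussian_fission(x, sigma, tau=1.0, rng=random):
    """Split one observation x ~ N(mu, sigma^2) into two independent pieces."""
    z = rng.gauss(0.0, sigma)  # external noise with the same known sigma
    return x + tau * z, x - z / tau


def sample_correlation(a, b):
    """Plain Pearson correlation, written out to stay dependency-free."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)
mu, sigma = 2.0, 1.5  # hypothetical true mean and known noise level
pairs = [
    gaussian_fission(rng.gauss(mu, sigma), sigma, tau=1.0, rng=rng)
    for _ in range(100_000)
]
first = [p[0] for p in pairs]
second = [p[1] for p in pairs]

fission_corr = sample_correlation(first, second)
first_mean = sum(first) / len(first)
print(f"correlation between the two pieces: {fission_corr:.4f}")
print(f"mean of the first piece:            {first_mean:.4f}")
```

Both pieces are centered at the same unknown mean, so one can serve as a noisy response and the other as a noisy predictor, which is the sense in which estimation can be treated as a regression problem.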