Shengyi He

ORCID: 0000-0001-8521-8639
About
Research Areas
  • Statistical Methods and Inference
  • Simulation Techniques and Applications
  • Iron and Steelmaking Processes
  • Probability and Risk Models
  • Statistical Methods and Bayesian Inference
  • Mineral Processing and Grinding
  • Markov Chains and Monte Carlo Methods
  • Software Reliability and Analysis Research
  • Speech and Audio Processing
  • Risk and Portfolio Optimization
  • Bayesian Methods and Mixture Models
  • Minerals Flotation and Separation Techniques
  • Face recognition and analysis
  • Metallurgical Processes and Thermodynamics
  • Granular flow and fluidized beds
  • Generative Adversarial Networks and Image Synthesis
  • Metal Extraction and Bioleaching
  • Statistical Distribution Estimation and Applications
  • Human Pose and Action Recognition
  • Human Motion and Animation
  • Probabilistic and Robust Engineering Design
  • Insurance, Mortality, Demography, Risk Management
  • CO2 Reduction Techniques and Catalysts
  • Cyclone Separators and Fluid Dynamics
  • Fuel Cells and Related Materials

Columbia University
2020-2024

Institute of Process Engineering
2016-2020

University of Chinese Academy of Sciences
2016-2020

University of Science and Technology of China
2019

BOE Technology Group (China)
2019

Stochastic root-finding problems are fundamental in the fields of operations research and data science. However, when the problem involves rare events, crude Monte Carlo can be prohibitively inefficient. Importance sampling (IS) is a commonly used approach, but selecting a good IS parameter requires knowledge of the problem’s solution, which creates a circular challenge. In “Adaptive Sampling for Efficient Root Finding Quantile Estimation,” He, Jiang, Lam, and Fu propose an adaptive approach to untie this...

10.1287/opre.2023.2484 article EN Operations Research 2023-06-02
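
The importance-sampling idea underlying this line of work is easy to sketch in isolation. The toy below estimates a Gaussian tail probability by sampling from a mean-shifted proposal and reweighting by the likelihood ratio; the threshold, proposal, and sample size are arbitrary illustrative choices, not the paper's algorithm (which adapts the sampler to an unknown root):

```python
import math
import random

def is_tail_prob(threshold, n, seed=0):
    """Estimate P(Z > threshold) for Z ~ N(0, 1) by importance sampling.

    Proposal: N(threshold, 1) (exponential tilting), so the rare region
    is hit roughly half the time instead of almost never.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted proposal
        if x > threshold:
            # likelihood ratio phi(x) / phi(x - threshold) = exp(-c*x + c^2/2)
            total += math.exp(-threshold * x + threshold**2 / 2)
    return total / n

est = is_tail_prob(4.0, 100_000)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # true P(Z > 4), about 3.17e-5
```

With crude Monte Carlo, 100,000 samples would see the event x > 4 only about three times; the tilted sampler gives a low-variance estimate from the same budget.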

Evaluating the reliability of intelligent physical systems against rare safety-critical events poses a huge testing burden for real-world applications. Simulation provides a useful platform to evaluate the extremal risks of these systems before their deployments. Importance Sampling (IS), while proven to be powerful for rare-event simulation, faces challenges in handling learning-based systems due to their black-box nature, which fundamentally undermines its efficiency guarantee and can lead to under-estimation without diagnostically...

10.48550/arxiv.2006.15722 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Rare-event simulation techniques, such as importance sampling (IS), constitute powerful tools to speed up the challenging estimation of rare catastrophic events. These techniques often leverage knowledge and analysis of underlying system structures to endow desirable efficiency guarantees. However, black-box problems, especially those arising from recent safety-critical applications of AI-driven physical systems, can fundamentally undermine their guarantees and lead to dangerous under-estimation without...

10.48550/arxiv.2111.02204 preprint EN other-oa arXiv (Cornell University) 2021-01-01

While batching and sectioning have been widely used in simulation, it remains open what their higher-order coverage behaviors are and whether one is better than the other in this regard. We develop techniques to obtain higher-order coverage errors for batching. We theoretically argue that neither of them is uniformly better in terms of coverage, but one usually has a smaller error when the number of batches is large. We also support our theoretical findings via numerical experiments.

10.1109/wsc52266.2021.9715418 article EN 2021 Winter Simulation Conference (WSC) 2021-12-12
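
The baseline construction these batching papers analyze can be sketched directly. The snippet below builds a plain batch-means confidence interval for a mean, with 10 batches and a hardcoded t quantile (the higher-order comparisons between batching variants in the paper concern refinements beyond this minimal form):

```python
import math
import random
import statistics

T_975_DF9 = 2.262  # t quantile for a 95% CI with 10 batches (9 d.o.f.)

def batching_ci(data, num_batches=10):
    """95% batch-means confidence interval for the mean of `data`.

    Split the data into equal batches, treat the batch means as
    approximately i.i.d. normal, and form a t interval from them.
    """
    b = len(data) // num_batches
    means = [statistics.fmean(data[i * b:(i + 1) * b]) for i in range(num_batches)]
    center = statistics.fmean(means)
    s = statistics.stdev(means)  # sample std of the batch means
    half = T_975_DF9 * s / math.sqrt(num_batches)
    return center - half, center + half

rng = random.Random(1)
data = [rng.gauss(5.0, 2.0) for _ in range(1000)]
lo, hi = batching_ci(data)
```

The appeal is that only `num_batches` estimates need to be aggregated, regardless of the cost of producing each data point.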

Batching methods operate by dividing data into batches and conducting inference by aggregating estimates from the batched data. These methods have been used extensively in simulation output analysis and, among other strengths, have the advantage of light computational cost when using a small number of batches. However, under budget constraints, it is open to our knowledge which batching approach among the range of alternatives is statistically optimal, which is important for guiding procedural configuration. We show that standard batching,...

10.1109/wsc60868.2023.10407948 article EN 2023 Winter Simulation Conference (WSC) 2023-12-10

Uncertainty quantification, by means of confidence interval (CI) construction, has been a fundamental problem in statistics and is also important for risk-aware decision-making. In this paper, we revisit the basic CI problem, but in the setting of expensive black-box models. In this setting, we are confined to using a low number of model runs, without the ability to obtain auxiliary information such as gradients. For this case, there exist classical methods based on data splitting, and newer methods based on resampling. However, while all these resulting CIs have...

10.48550/arxiv.2408.05887 preprint EN arXiv (Cornell University) 2024-08-11
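
One resampling-based construction in this low-budget spirit is the "cheap bootstrap" idea of using only a handful of resamples and comparing the pivot to a t distribution with as many degrees of freedom. The sketch below applies it to a sample mean; the choice of 4 resamples and the hardcoded quantile are illustrative, and this is not necessarily the paper's own procedure:

```python
import math
import random
import statistics

T_975_DF4 = 2.776  # t quantile with B = 4 degrees of freedom

def few_resample_ci(data, num_resamples=4, seed=0):
    """95% CI for the mean using only a few bootstrap resamples.

    The spread of the resampled estimates around the full-data estimate
    plays the role of the standard error; the pivot is approximately
    t-distributed with `num_resamples` degrees of freedom.
    """
    rng = random.Random(seed)
    est = statistics.fmean(data)
    sq = 0.0
    for _ in range(num_resamples):
        resample = rng.choices(data, k=len(data))
        sq += (statistics.fmean(resample) - est) ** 2
    s = math.sqrt(sq / num_resamples)
    return est - T_975_DF4 * s, est + T_975_DF4 * s

rng = random.Random(3)
data = [rng.gauss(5.0, 2.0) for _ in range(200)]
lo, hi = few_resample_ci(data)
```

For an expensive black-box model, each "resample estimate" would be one additional model run, so the total budget here is only five evaluations.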

Recently, 2D speaking avatars have increasingly participated in everyday scenarios due to the fast development of facial animation techniques. However, most existing works neglect explicit control over human bodies. In this paper, we propose to drive not only the faces but also the torso and gesture movements of a speaking figure. Inspired by recent advances in diffusion models, we propose the Motion-Enhanced Textural-Aware ModeLing for SpeaKing Avatar Reenactment (TALK-Act) framework, which enables high-fidelity avatar reenactment from...

10.48550/arxiv.2410.10696 preprint EN arXiv (Cornell University) 2024-10-14

Lip-syncing videos with given audio is the foundation for various applications including the creation of virtual presenters or performers. While recent studies explore high-fidelity lip-sync with different techniques, their task-orientated models either require long-term clip-specific training or retain visible artifacts. In this paper, we propose a unified and effective framework, ReSyncer, that synchronizes generalized audio-visual facial information. The key design is revisiting and rewiring the Style-based...

10.48550/arxiv.2408.03284 preprint EN arXiv (Cornell University) 2024-08-06

Person Re-IDentification (Re-ID) has developed rapidly with deep learning methods, yet the base models used in most of them are not customized for the Re-ID task. Although some studies have carefully designed models for this special task, such models are not always easy to use as a base model or to extend with new methods due to their great complexity. In this paper, we propose a novel and efficient model named Multi-granularity Feature Boosting Network (MFBN). MFBN consists of branches extracting information at different granularities and combines them into one whole, so...

10.1145/3325730.3325764 article EN 2019-04-12

In solving simulation-based stochastic root-finding or optimization problems that involve rare events, such as in extreme quantile estimation, running crude Monte Carlo can be prohibitively inefficient. To address this issue, importance sampling can be employed to drive down the error to a desirable level. However, selecting a good sampler requires knowledge of the solution to the problem at hand, which is the goal to begin with and thus forms a circular challenge. We investigate the use of adaptive sampling to untie this circularity. Our...

10.48550/arxiv.2102.10631 preprint EN other-oa arXiv (Cornell University) 2021-01-01

Importance sampling (IS) is a powerful tool for rare-event estimation. However, in many settings, we need to estimate not only the performance expectation but also its gradient. In this paper, we build a bridge from IS for estimation to IS for gradient estimation. We establish that, for a class of problems, an efficient sampler for estimating the probability of the underlying rare event is also efficient for estimating gradients of expectations over the same set. We show that both infinitesimal perturbation analysis and likelihood ratio estimators can be studied under the proposed...

10.1109/wsc57314.2022.10015239 article EN 2022 Winter Simulation Conference (WSC) 2022-12-11
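
The likelihood-ratio (score-function) gradient estimator mentioned above can be illustrated on a toy problem. For X ~ N(theta, 1), the score is (x - theta), so averaging f(x) * (x - theta) estimates the gradient of E[f(X)] in theta; the function f(x) = x^2 and the parameter values below are arbitrary illustrative choices, not the paper's setting:

```python
import random

def lr_gradient(theta, n=200_000, seed=0):
    """Likelihood-ratio gradient estimate of d/dtheta E[X^2], X ~ N(theta, 1).

    Since E[X^2] = theta^2 + 1, the true gradient is 2 * theta, which
    lets us check the estimator directly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        total += x * x * (x - theta)  # f(x) * score(x)
    return total / n

grad = lr_gradient(1.5)  # true value: 2 * 1.5 = 3.0
```

Nothing here requires differentiating f itself, which is why the likelihood-ratio estimator pairs naturally with black-box performance functions.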

Existing batching methods are designed to cancel out the variability parameter but not the bias of estimators, and are thus typically applied in the setting of unbiased estimation. We provide a scheme that cancels out both parameters of the estimators simultaneously, yielding asymptotically exact confidence intervals for biased estimation problems. We apply our method to finite-difference estimators. We extend to the multivariate case by constructing confidence regions. We validate our theory and analyze the effect of the number of batches through numerical examples.

10.1109/wsc57314.2022.10015356 article EN 2022 Winter Simulation Conference (WSC) 2022-12-11

Distributionally robust optimization (DRO) is a worst-case framework for stochastic optimization under uncertainty that has drawn fast-growing studies in recent years. When the underlying probability distribution is unknown and observed from data, DRO suggests computing the worst case within a so-called ambiguity set that captures the involved statistical uncertainty. In particular, DRO with the set constructed as a divergence neighborhood ball has been shown to provide a tool for constructing valid confidence intervals for nonparametric functionals, and bears a duality...

10.48550/arxiv.2108.05908 preprint EN other-oa arXiv (Cornell University) 2021-01-01
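
For a Kullback-Leibler divergence ball, the worst-case mean admits the standard one-dimensional dual inf_{a>0} { a*log E[exp(f/a)] + a*rho }, which makes the construction easy to sketch. The snippet below minimizes the dual with a crude grid search (the grid and example values are arbitrary; a real implementation would use a proper 1D solver):

```python
import math

def kl_dro_worst_case(values, rho, alphas=None):
    """Worst-case mean of `values` over distributions within KL-divergence
    `rho` of the empirical distribution, via the dual
        inf_{a>0}  a * log E_P[exp(f / a)] + a * rho,
    minimized here by a simple grid search over a (a sketch, not robust).
    """
    if alphas is None:
        alphas = [10 ** (k / 20) for k in range(-40, 81)]  # 0.01 .. 1e4
    n = len(values)
    m = max(values)
    best = float("inf")
    for a in alphas:
        # log-sum-exp for numerical stability
        lse = m / a + math.log(sum(math.exp((v - m) / a) for v in values) / n)
        best = min(best, a * lse + a * rho)
    return best

vals = [1.0, 2.0, 3.0, 4.0]
wc = kl_dro_worst_case(vals, rho=0.1)
```

By construction the result lies between the empirical mean (rho = 0) and the sample maximum (rho large), and it grows monotonically with the ball radius rho.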

While batching methods have been widely used in simulation and statistics, it is open what their higher-order coverage behaviors are and whether one variant is better than the others in this regard. We develop techniques to obtain coverage errors for batching methods by building Edgeworth-type expansions on $t$-statistics. The coefficients of these expansions are intricate to obtain analytically, but we provide algorithms to estimate the coefficients of the $n^{-1}$ error term via Monte Carlo simulation. We provide insights into the effect of the number of batches, where we demonstrate generally...

10.48550/arxiv.2111.06859 preprint EN other-oa arXiv (Cornell University) 2021-01-01