Zhengyuan Zhou

ORCID: 0000-0002-0005-9411
Research Areas
  • Advanced Bandit Algorithms Research
  • Reinforcement Learning in Robotics
  • Peroxisome Proliferator-Activated Receptors
  • Machine Learning and Algorithms
  • Sparse and Compressive Sensing Techniques
  • Stochastic Gradient Optimization Techniques
  • Diabetes, Cardiovascular Risks, and Lipoproteins
  • Auction Theory and Applications
  • Lipid metabolism and disorders
  • Advanced Queuing Theory Analysis
  • Advanced Wireless Network Optimization
  • Liver Disease Diagnosis and Treatment
  • Smart Grid Energy Management
  • Optimization and Search Problems
  • Guidance and Control Systems
  • Game Theory and Applications
  • Privacy-Preserving Technologies in Data
  • Cancer, Lipids, and Metabolism
  • Metabolism, Diabetes, and Cancer
  • Adipose Tissue and Metabolism
  • Cardiovascular Health and Risk Factors
  • Distributed Sensor Networks and Detection Algorithms
  • Nutrition, Genetics, and Disease
  • Cooperative Communication and Network Coding
  • Cancer, Hypoxia, and Metabolism

New York University
2020-2025

Xi'an Jiaotong University
2020-2025

State Key Laboratory of Electrical Insulation and Power Equipment
2023-2025

Gansu Provincial Center for Disease Control and Prevention
2022-2024

Community Health Center
2024

Fu Wai Hospital
2024

Chinese Academy of Medical Sciences & Peking Union Medical College
2015-2024

State Key Laboratory of Cardiovascular Disease
2024

Soochow University
2011-2024

Stanford University
2013-2023

Low-rank tensor recovery in the presence of sparse but arbitrary errors is an important problem with many practical applications. In this work, we propose a general framework that recovers low-rank tensors, in which the data can be deformed by some unknown transformations and corrupted by errors. We give a unified presentation of the surrogate-based formulations that incorporate the features of rectification and alignment simultaneously, and establish worst-case error bounds for the recovered tensor. In this context, the state-of-the-art methods...

10.1109/tpami.2019.2929043 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-07-16

As a result of the digitization of the economy, more and more decision makers from a wide range of domains have gained the ability to target products, services, and information provision based on individual characteristics. Examples include selecting offers, prices, advertisements, or emails to send to consumers, choosing a bid to submit in contextual first-price auctions, and determining which medication to prescribe to a patient. The key to enabling this is to learn a treatment policy from historical observational data in a sample-efficient way, hence...

10.1287/opre.2022.2271 article EN Operations Research 2022-06-07

Inverse reinforcement learning (IRL) attempts to use demonstrations of “expert” decision making in a Markov decision process to infer a corresponding policy that shares the “structured, purposeful” qualities of the expert's actions. In this paper, we extend the maximum causal entropy framework, a notable paradigm in IRL, to the infinite time horizon setting. We consider two formulations (maximum discounted and maximum average causal entropy) appropriate for the infinite horizon case and show that both result in optimization programs that can be reformulated as convex optimization problems;...

10.1109/tac.2017.2775960 article EN IEEE Transactions on Automatic Control 2017-11-20

We consider a stochastic lost-sales inventory control system with lead time L over a planning horizon T. Supply is uncertain, and it is a function of the order quantity (because of random yield/capacity, etc.). We aim to minimize the T-period cost, a problem that is known to be computationally intractable even under known distributions of demand and supply. In this paper, we assume that both the demand and supply distributions are unknown and develop an efficient online learning algorithm. We show that our algorithm achieves a regret (i.e., the performance gap between its cost and that of an...

10.1287/mnsc.2022.02476 article EN Management Science 2024-03-04

A reach-avoid game is one in which an agent attempts to reach a predefined goal, while avoiding some adversarial circumstance induced by an opposing agent or disturbance. The analysis of such games plays an important role in problems such as safe motion planning and obstacle avoidance, yet computing solutions is often computationally expensive due to the need to consider adversarial inputs. In this work, we present an open-loop formulation of a two-player reach-avoid game whereby the players define their control inputs prior to the start of the game. We define two open-loop games, each...

10.1109/cdc.2012.6426643 article EN 2012-12-01

A multiplayer reach-avoid game is a differential game between an attacking team with N_A attackers and a defending team with N_D defenders playing on a compact domain with obstacles. The attacking team aims to send M of the N_A attackers to some target location, while the defending team aims to prevent that by capturing attackers or indefinitely delaying them from reaching the target. Although the analysis of this game plays an important role in many applications, the optimal solution is computationally intractable when N_A>1 or N_D>1. In this paper, we present two approaches for the N_A=N_D=1 case to determine pairwise outcomes,...

10.1109/tac.2016.2577619 article EN publisher-specific-oa IEEE Transactions on Automatic Control 2016-06-07

In this paper, we present an open-loop formulation of a single-pursuer-multiple-evader pursuit-evasion game. In this game, the pursuer attempts to minimize the total capture time of all the evaders while the evaders, as a team, cooperate to maximize this time. The information pattern considered here is conservative towards the evaders. One important advantage of this approach over the geometrical approaches in the literature is that it provides a guaranteed survival time for the evader team for all initial conditions, without the limitation that the pursuer must capture the evaders in a specific sequence. Another is that under...

10.1109/acc.2013.6580676 article EN American Control Conference 2013-06-01

We consider a multiplayer reach-avoid game with an equal number of attackers and defenders moving with simple dynamics on a two-dimensional domain, possibly with obstacles. The attacking team attempts to send as many attackers to a certain target location as possible, as quickly as possible, while the defenders aim to capture the attackers to prevent the attacking team from reaching its goal. The analysis of problems like this plays an important role in collision avoidance, motion planning, and aircraft control, among other applications. Computing optimal solutions for such multiplayer games is intractable due...

10.1109/acc.2014.6859219 article EN American Control Conference 2014-06-01

We consider a multiplayer reach-avoid game played between N attackers and N defenders moving with simple dynamics on a general two-dimensional domain. The attackers attempt to win the game by sending at least M of them (1 ≤ M ≤ N) to a target location while the defenders try to prevent the attackers from doing so by capturing them. The analysis of this game plays an important role in collision avoidance, motion planning, and aircraft control, among other applications involving cooperative agents. The high dimensionality of the game makes computing an optimal solution for either side...

10.1109/cdc.2014.7039758 article EN 2014-12-01

Background: Trimethylamine N‐oxide (TMAO) is a gut‐derived atherogenic metabolite. However, the role of TMAO and its precursors in the development of stroke remains unclear. We aimed to examine the associations between metabolites in TMAO biosynthesis and stroke risk. Methods: A nested case‐control study was performed in a community‐based cohort (2013–2018, n = 16,113). We included 412 identified cases and controls matched by age and sex. Plasma carnitine, choline, betaine, trimethyl lysine (TML), and TMAO were measured by ultrahigh...

10.1111/joim.13572 article EN Journal of Internal Medicine 2022-10-06

In “Optimal No-Regret Learning in Repeated First-Price Auctions,” Y. Han, W. Tsachy, and Z. Zhou study online learning in repeated first-price auctions where a bidder, only observing the winning bid at the end of each auction, learns to adaptively bid to maximize her cumulative payoff. To achieve this goal, the bidder faces censored feedback: if she wins the bid, then she is not able to observe the highest bid of the other bidders, which is assumed to be drawn i.i.d. from an unknown distribution. In this paper, they develop the first learning algorithm that...

10.1287/opre.2020.0282 article EN Operations Research 2024-07-08

Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback
Curious about how players can learn and adapt in unknown games without knowing the game’s dynamics? In “Doubly Optimal No-Regret Online Learning in Strongly Monotone Games with Bandit Feedback,” Ba, Lin, Zhang, and Zhou present a novel bandit learning algorithm for no-regret learning in games where each player only observes its reward determined by all players’ current joint action, not its gradient. Focusing on smooth and strongly monotone games, they introduce a bandit learning algorithm using self-concordant barrier functions. This...

10.1287/opre.2021.0445 article EN Operations Research 2025-01-03

Objective: This study aimed to enhance the prevention and control of pulmonary tuberculosis (PTB) and provide more effective and accurate prevention and control methods in Changshu City. Methods: The PTB patients’ information came from the China Information System for Disease Control and Prevention (CISDCP). The demographic data of the city and towns came from the Suzhou Statistical Yearbook and the LandScan platform. ArcGIS was used for global spatial autocorrelation analysis and local spatial autocorrelation analysis. Univariate logistic regression and multivariate logistic regression were used to analyze the influencing factors...

10.1371/journal.pone.0317269 article EN cc-by PLoS ONE 2025-01-16

Designing learning agents that explore efficiently in a complex environment has been widely recognized as a fundamental challenge in reinforcement learning. While a number of works have demonstrated the effectiveness of techniques based on randomized value functions for a single agent, it remains unclear, from a theoretical point of view, whether injecting randomization can help a society of agents concurrently explore an environment. The results that we established in this work tender an affirmative answer to this question. We adapt...

10.48550/arxiv.2501.13394 preprint EN arXiv (Cornell University) 2025-01-23