Wenhan Cao

ORCID: 0009-0001-1367-4210
Research Areas
  • Adaptive Dynamic Programming Control
  • Reinforcement Learning in Robotics
  • Target Tracking and Data Fusion in Sensor Networks
  • Advanced Control Systems Optimization
  • Autonomous Vehicle Technology and Safety
  • Traffic Control and Management
  • Stability and Control of Uncertain Systems
  • Stability and Controllability of Differential Equations
  • Fault Detection and Control Systems
  • Control Systems and Identification
  • Advanced Bandit Algorithms Research
  • Energy Load and Power Forecasting
  • Statistical Methods and Inference
  • Astronomical Observations and Instrumentation
  • Statistical and numerical algorithms
  • Advanced Statistical Methods and Models
  • Advanced Vision and Imaging
  • Cardiovascular Function and Risk Factors
  • Robotics and Sensor-Based Localization
  • Vehicle Dynamics and Control Systems
  • Robotic Path Planning Algorithms
  • Electric Power System Optimization
  • Advanced Measurement and Detection Methods
  • Electric and Hybrid Vehicle Technologies
  • Age of Information Optimization

Tsinghua University
2019-2024

University of California, San Diego
2022

National University of Singapore
2022

University of Science and Technology Beijing
2022

The convergence of policy gradient algorithms in reinforcement learning hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these landscapes can often be acquired by analyzing those of linear quadratic control. However, most of the existing literature only considers the landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a...

10.1109/tac.2023.3275732 article EN IEEE Transactions on Automatic Control 2023-05-12
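
As a loose illustration of the landscape viewpoint above, the sketch below runs vanilla policy gradient on a scalar LQR cost with a static feedback gain. The system data, finite-difference gradient, and static (rather than dynamic output-feedback) policy are all simplifying assumptions of this example, not the paper's dLQR setting.

    # Policy gradient descending a scalar LQR cost (assumed toy data).
    import numpy as np

    a, b, q, r = 0.9, 0.5, 1.0, 0.1   # scalar system and cost weights
    T = 50                            # rollout horizon

    def cost(k, x0=1.0):
        """Finite-horizon LQR cost of the static feedback u = -k*x."""
        x, J = x0, 0.0
        for _ in range(T):
            u = -k * x
            J += q * x**2 + r * u**2
            x = a * x + b * u
        return J

    k, lr, eps = 0.0, 1e-3, 1e-4
    for _ in range(500):
        # model-free flavor: finite-difference gradient from rollouts
        grad = (cost(k + eps) - cost(k - eps)) / (2 * eps)
        k -= lr * grad

    print("learned gain:", k)

Because the cost is a smooth function of the gain on the stabilizing set, this descent converges; the paper's point is that the landscape for dynamic output-feedback policies is considerably more complicated.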

Learning the reward function that a driver adheres to is of importance for the human-like design of autonomous driving systems. Inverse reinforcement learning (IRL) is one of the recent advances that can achieve this objective, but it often suffers from low efficiency because it generates the optimal policy by reinforcement learning (RL) each time the reward weights are updated. This paper presents an accelerated IRL method that searches among randomly pre-sampled policies in a designed sub-space instead of finding the optimal policy through RL in the whole space. The corresponding trajectories...

10.1109/itsc.2019.8916952 article EN 2019-10-01
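
The core acceleration idea, as the abstract describes it, is to replace each inner RL solve with a search over pre-sampled policies. The sketch below shows that idea with a linear reward and hypothetical trajectory feature vectors; the feature dimension, pool size, and plain feature-matching update are illustrative assumptions, not the paper's exact algorithm.

    # Accelerated-IRL flavor: pick the best pre-sampled policy per update.
    import numpy as np

    rng = np.random.default_rng(0)
    phi_expert = np.array([0.8, 0.2, 0.5])       # expert trajectory features (assumed)
    phi_pool = rng.uniform(0, 1, size=(200, 3))  # features of 200 pre-sampled policies

    w = np.zeros(3)                              # linear reward weights
    for _ in range(100):
        # the "optimal" policy is a table lookup, not an RL solve
        best = phi_pool[np.argmax(phi_pool @ w)]
        # move the reward toward matching expert features
        w += 0.1 * (phi_expert - best)

    print("learned reward weights:", w)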

The accuracy of moving horizon estimation (MHE) degrades significantly under measurement outliers. Existing methods usually formulate combinatorial optimization problems to address this issue and are restricted to linear systems to ensure computational tractability. To overcome those limitations, this paper proposes a generalized MHE (GMHE) approach that formulates the estimation task as a maximum a posteriori problem and extends the standard MHE with a generalized loss function. The proposed method avoids the high complexity of existing approaches and has no restriction on...

10.23919/acc55779.2023.10156391 article EN 2022 American Control Conference (ACC) 2023-05-31
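
One concrete instance of the "generalized loss" idea is to keep the quadratic process term of standard MHE but score measurement residuals with an outlier-robust loss. The sketch below does this for a scalar system with a Huber loss; the model data, horizon, and choice of Huber are assumptions for illustration, and a general-purpose solver stands in for the paper's formulation.

    # MHE as a MAP problem with a robust (Huber) measurement loss.
    import numpy as np
    from scipy.optimize import minimize

    a, c, q, r, N = 0.95, 1.0, 0.01, 0.1, 10        # toy model and horizon
    rng = np.random.default_rng(1)
    x_true = np.cumprod(np.full(N, a))              # short true trajectory from x0 = 1
    y = c * x_true + rng.normal(0, np.sqrt(r), N)
    y[5] += 5.0                                     # inject one measurement outlier

    def huber(e, delta=0.3):
        return np.where(np.abs(e) <= delta, 0.5 * e**2, delta * (np.abs(e) - 0.5 * delta))

    def gmhe_cost(x):
        proc = x[1:] - a * x[:-1]                   # process residuals (kept quadratic)
        meas = y - c * x                            # measurement residuals (robustified)
        return np.sum(proc**2) / q + np.sum(huber(meas)) / r

    x_hat = minimize(gmhe_cost, y.copy()).x         # warm start from raw measurements
    print("estimate at outlier step:", x_hat[5], "true value:", x_true[5])

The Huber loss grows linearly in the tails, so the outlying measurement is effectively down-weighted instead of dragging the whole horizon estimate.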

The H-infinity filter has been widely applied in the engineering field, but coping with bounded noise is still an open and difficult problem. This paper considers the filtering problem for a linear system with bounded process and measurement noise. The problem is first formulated as a zero-sum game in which the dynamics of the estimation error are non-affine with respect to the filter gain. A non-quadratic Hamilton-Jacobi-Isaacs (HJI) equation is then derived by employing a non-quadratic cost to characterize the bounded noise; this equation is extremely difficult to solve due to its nonlinear properties....

10.1109/icsp48669.2020.9320936 article EN 2022 16th IEEE International Conference on Signal Processing (ICSP) 2020-12-06
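
The zero-sum-game formulation can be made concrete in a scalar setting: the filter picks a gain, the bounded noise acts as an adversary at its bounds, and the value is the worst-case steady-state estimation error. The sketch below grid-searches that minimax gain; the scalar model, closed-form worst case, and grid search are assumptions of this illustration and sidestep the HJI machinery the paper actually tackles.

    # Minimax filter gain for scalar error dynamics e+ = (a - k*c) e + w - k*v.
    import numpy as np

    a, c = 0.9, 1.0                  # toy error-dynamics data
    w_bar, v_bar = 0.1, 0.2          # noise bounds |w| <= w_bar, |v| <= v_bar

    def worst_case_error(k):
        rho = abs(a - k * c)         # contraction rate of the error dynamics
        if rho >= 1.0:
            return np.inf            # non-contracting gain: the adversary wins
        # adversarial noise sits at its bounds, so |e| converges to this value
        return (w_bar + abs(k) * v_bar) / (1.0 - rho)

    gains = np.linspace(-1.0, 2.0, 3001)
    k_star = min(gains, key=worst_case_error)
    print("minimax gain:", k_star, "worst-case error:", worst_case_error(k_star))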

Adaptive dynamic programming (ADP) for stochastic linear quadratic regulation (LQR) demands the precise computation of integrals during policy iteration (PI). In a fully model-free problem setting, these integrals can only be approximated from state samples collected at discrete time points using computational methods such as the canonical Euler-Maruyama method. Our research reveals a critical phenomenon: the sampling period can significantly impact control performance. This is due to the fact that the errors introduced in...

10.48550/arxiv.2402.09575 preprint EN arXiv (Cornell University) 2024-02-14
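
The phenomenon the abstract points to can be reproduced with a small experiment: discretize the same cost integral with the Euler-Maruyama scheme at different sampling periods and watch the estimate drift. The scalar SDE and parameters below are assumptions for illustration.

    # Euler-Maruyama approximation of the PI cost integral at several periods.
    import numpy as np

    a, sigma, T = -1.0, 0.5, 5.0     # dx = a x dt + sigma dW, horizon T
    rng = np.random.default_rng(2)

    def cost_integral(dt):
        """Rollout returning the sampled approximation of integral x^2 dt."""
        x, J = 1.0, 0.0
        for _ in range(int(T / dt)):
            J += x**2 * dt           # rectangle rule on discrete samples
            x += a * x * dt + sigma * np.sqrt(dt) * rng.normal()
        return J

    for dt in (0.5, 0.1, 0.01, 0.001):
        print(f"dt={dt:6.3f}  integral estimate={cost_integral(dt):.4f}")

Each period uses a fresh noise realization here, so the spread mixes discretization bias with sampling noise; the paper studies how such period-dependent errors propagate into control performance.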

Bayesian filtering serves as the mainstream framework for state estimation in dynamic systems. Its standard version applies the total probability rule and Bayes' law alternately, where how to define and compute the conditional probability is critical for distribution inference. Previously, the conditional probability was assumed to be exactly known, representing a measure of the occurrence of one event given a second event. In this paper, we find that by adding an additional event that stipulates an inequality condition, we can transform the conditional probability into a special integration...

10.48550/arxiv.2404.00481 preprint EN arXiv (Cornell University) 2024-03-30
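
For reference, the standard Bayesian filter the abstract builds on alternates a total-probability prediction with a Bayes'-law update. The sketch below spells that loop out on a toy discrete state space; the transition kernel and Gaussian-shaped likelihood are assumptions for illustration.

    # Standard Bayesian filtering: predict (total probability), update (Bayes' law).
    import numpy as np

    states = np.array([0, 1, 2])                 # toy discrete state space
    P = np.array([[0.8, 0.2, 0.0],               # transition kernel P[i, j] = p(j | i)
                  [0.1, 0.8, 0.1],
                  [0.0, 0.2, 0.8]])

    def likelihood(y, x):
        return np.exp(-0.5 * (y - x)**2)         # Gaussian-shaped measurement model

    belief = np.full(3, 1 / 3)                   # uniform prior
    for y in (0.1, 0.9, 2.2):
        belief = belief @ P                      # predict: total probability rule
        belief = belief * likelihood(y, states)  # update: Bayes' law (unnormalized)
        belief /= belief.sum()                   # normalize to a distribution
        print("posterior:", np.round(belief, 3))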

The convergence of policy gradient algorithms in reinforcement learning hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these landscapes can often be acquired by analyzing those of linear quadratic control. However, most of the existing literature only considers the landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a...

10.1109/cdc51059.2022.9992503 article EN 2022 IEEE 61st Conference on Decision and Control (CDC) 2022-12-06

In autonomous driving, the ego vehicle and its surrounding traffic environments always have uncertainties like parameter and structural errors, behavior randomness of road users, etc. Furthermore, environmental sensors are noisy or even biased. This problem can be formulated as a partially observable Markov decision process. Existing methods lack a good representation of historical information, making it very challenging to find an optimal policy. This paper proposes a belief state separated reinforcement...

10.1109/itsc48978.2021.9564576 article EN 2021-09-19
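
One way to make the belief-state idea tangible is to track a distribution over an unobservable behavior mode of a surrounding vehicle and hand it to the policy alongside the observable ego state. Everything in the sketch below (the two-intent model, the likelihoods, and the toy braking policy) is a hypothetical stand-in for the paper's learned components.

    # Driving a policy with a recursively updated belief over hidden intent.
    import numpy as np

    intents = ("yield", "assert")                # hidden behavior modes of the other car
    belief = np.array([0.5, 0.5])

    def obs_likelihood(decel, intent):
        """Yielding drivers tend to decelerate; asserting drivers do not."""
        mean = -1.0 if intent == "yield" else 0.2
        return np.exp(-0.5 * (decel - mean)**2)

    def policy(ego_speed, belief):
        # toy policy: brake harder the more likely the other car asserts
        return -2.0 * belief[1] if ego_speed > 10 else 0.0

    for decel in (-0.8, -0.1, 0.3):              # observed decelerations of the other car
        lik = np.array([obs_likelihood(decel, i) for i in intents])
        belief = belief * lik / (belief * lik).sum()
        print("belief:", np.round(belief, 2),
              "ego accel cmd:", round(policy(12.0, belief), 2))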

This paper proposes a primal-dual framework to learn a stable estimator for linear constrained estimation problems leveraging the moving horizon approach. To avoid the online computational burden of most existing methods, we parameterize a function offline to approximate the primal estimate. Meanwhile, a dual estimator is trained to check the suboptimality of the primal estimator during execution time. Both primal and dual estimators are learned from data using supervised learning techniques, and an explicit sample size is provided, which enables us to guarantee...

10.1109/cdc51059.2022.9992814 article EN 2022 IEEE 61st Conference on Decision and Control (CDC) 2022-12-06
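
The division of labor described here (a primal map imitating optimal estimates, a dual map approximating the optimal cost so suboptimality can be checked online) can be sketched on a deliberately simple problem. Below, scalar averaging plays the role of the optimal estimator and linear least squares plays the role of both learners; all of that is an illustrative simplification, not the paper's moving-horizon construction.

    # Primal learner imitates optimal estimates; dual learner checks suboptimality.
    import numpy as np

    rng = np.random.default_rng(3)
    N, M = 8, 2000                                           # window length, training instances
    Y = rng.normal(0, 1, (M, N)) + rng.normal(0, 2, (M, 1))  # batches of noisy measurements

    x_opt = Y.mean(axis=1)                                   # optimal estimate per instance
    J_opt = ((Y - x_opt[:, None])**2).sum(axis=1)            # optimal cost per instance

    w_primal, *_ = np.linalg.lstsq(Y, x_opt, rcond=None)     # primal: measurements -> estimate
    feats = (Y - Y.mean(axis=1, keepdims=True))**2           # features for the cost model
    w_dual, *_ = np.linalg.lstsq(feats, J_opt, rcond=None)   # dual: measurements -> optimal cost

    y_new = rng.normal(0, 1, N) + 1.5                        # a fresh instance at execution time
    x_hat = y_new @ w_primal
    gap = ((y_new - x_hat)**2).sum() - ((y_new - y_new.mean())**2) @ w_dual
    print("estimated suboptimality gap:", gap)               # near zero: primal accepted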

Estimating the state of a stochastic system is a long-standing issue in the areas of engineering and science. Existing methods either use approximations or yield a high computational burden. In this paper, we propose the reinforced optimal estimator (ROE), which is an offline estimator for general nonlinear non-Gaussian models. This method solves estimation problems offline, and the learned estimator can be applied online efficiently. Firstly, we demonstrate that minimum-variance estimation requires us to solve an optimization problem online, which causes low efficiency. To...

10.1016/j.ifacol.2021.11.201 article EN IFAC-PapersOnLine 2021-01-01
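
The offline/online split the abstract advertises can be mimicked with a much cruder optimizer: tune a fixed-gain nonlinear filter against simulated estimation error offline, then run the tuned recursion online at negligible cost. The nonlinear model, fixed-gain estimator class, and grid search below are assumptions of this sketch; the paper uses reinforcement learning in their place.

    # Offline estimator tuning: minimize simulated MSE, deploy a fixed recursion.
    import numpy as np

    rng = np.random.default_rng(4)
    f = lambda x: 0.8 * np.sin(x)                   # toy nonlinear process model
    T, n_traj = 100, 50

    def mse_of_gain(k):
        err = 0.0
        for _ in range(n_traj):                     # offline: average over sampled trajectories
            x, x_hat = 0.0, 0.0
            for _ in range(T):
                x = f(x) + 0.1 * rng.normal()
                y = x + 0.3 * rng.normal()
                x_hat = f(x_hat) + k * (y - f(x_hat))   # candidate fixed-gain recursion
                err += (x - x_hat)**2
        return err / (n_traj * T)

    k_star = min(np.linspace(0.0, 1.0, 21), key=mse_of_gain)
    print("offline-learned gain:", k_star)          # online use is one line per step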

Multi-sensor calibration is crucial for mobile robots, especially in logistics and warehouse environments. While techniques exist for calibrating individual sensors, industry-level calibration of the multiple sensors on robots covering the entire operational life cycle is rarely explored. To bridge this gap, we present a fast, accurate, and scalable calibration system supporting both the manufacturing and operation phases with diverse sensors like odometers, IMUs, LiDARs, and cameras. Specifically, we propose an online yaw angle estimation approach for the IMU by fusing...

10.1109/cvci59596.2023.10397235 article EN 2021 5th CAA International Conference on Vehicular Control and Intelligence (CVCI) 2023-10-27
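
A stripped-down version of online yaw estimation for an IMU is to compare its integrated heading against the odometry heading while the robot drives and average the discrepancy recursively. The simulated signals, noise levels, and recursive-mean estimator below are assumptions of this sketch, not the paper's fusion approach.

    # Online estimation of the IMU mounting yaw from heading discrepancies.
    import numpy as np

    rng = np.random.default_rng(5)
    true_offset = np.deg2rad(3.0)                   # unknown IMU mounting yaw
    offset_hat, n, heading = 0.0, 0, 0.0

    for t in range(500):
        rate = 0.2 * np.sin(0.01 * t)               # simulated robot yaw rate
        heading += rate * 0.01                      # ground-truth heading (dt = 0.01 s)
        yaw_odom = heading + rng.normal(0, 0.002)   # odometry heading
        yaw_imu = heading + true_offset + rng.normal(0, 0.005)  # IMU heading
        n += 1
        offset_hat += (yaw_imu - yaw_odom - offset_hat) / n     # recursive mean

    print("estimated mounting yaw [deg]:", np.rad2deg(offset_hat))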

The H-infinity filter has been widely applied in the engineering field, but coping with bounded noise is still an open and difficult problem. This paper considers the filtering problem for a linear system with bounded process and measurement noise. The problem is first formulated as a zero-sum game in which the dynamics of the estimation error are non-affine with respect to the filter gain. A non-quadratic Hamilton-Jacobi-Isaacs (HJI) equation is then derived by employing a non-quadratic cost to characterize the bounded noise; this equation is extremely difficult to solve due to its nonlinear properties. Next,...

10.48550/arxiv.2008.00674 preprint EN other-oa arXiv (Cornell University) 2020-01-01

The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these landscapes can often be acquired by analyzing those of linear quadratic control. However, most of the existing literature only considers the landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a rather complicated...

10.48550/arxiv.2201.09598 preprint EN other-oa arXiv (Cornell University) 2022-01-01

State estimation is critical to control systems, especially when the states cannot be directly measured. This paper presents an approximate optimal filter, which enables the use of the policy iteration technique to obtain the steady-state filter gain in linear Gaussian time-invariant systems. The design transforms the filtering problem with minimum mean square error into an optimal control problem, called the Approximate Optimal Filtering (AOF) problem. The equivalence holds given certain conditions about initial state distributions and policy formats,...

10.48550/arxiv.2103.05505 preprint EN other-oa arXiv (Cornell University) 2021-01-01
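
The filtering-to-control equivalence lets policy iteration stand in for the Riccati recursion. In the scalar linear Gaussian case this is fully explicit, as the sketch below shows: evaluating a gain means solving for its steady-state error variance, and improving it means minimizing the next-step variance. The scalar model and noise variances are assumptions for illustration.

    # Policy iteration on the filter gain, cross-checked against the Riccati recursion.
    import numpy as np

    a, c, q, r = 0.95, 1.0, 0.04, 0.25   # model and process/measurement noise variances
    k = 0.5                              # initial gain with |a - k*c| < 1

    for _ in range(20):
        # policy evaluation: solve p = (a - k c)^2 p + q + k^2 r for p
        p = (q + k**2 * r) / (1 - (a - k * c)**2)
        # policy improvement: gain minimizing the next-step error variance
        k = a * p * c / (c**2 * p + r)
    print("policy-iteration gain:", k)

    p = 1.0                              # steady-state Riccati recursion for comparison
    for _ in range(2000):
        p = a * p * a + q - (a * p * c)**2 / (c**2 * p + r)
    print("Kalman predictor gain:", a * p * c / (c**2 * p + r))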