Laixi Shi

ORCID: 0000-0003-4038-8620
Research Areas
  • Reinforcement Learning in Robotics
  • Advanced Bandit Algorithms Research
  • Indoor and Outdoor Localization Technologies
  • Image and Signal Denoising Methods
  • Data Stream Mining Techniques
  • Domain Adaptation and Few-Shot Learning
  • Sparse and Compressive Sensing Techniques
  • Speech and Audio Processing
  • Adaptive Dynamic Programming Control
  • Video Surveillance and Tracking Methods
  • Machine Learning and ELM
  • Adversarial Robustness in Machine Learning
  • Gait Recognition and Analysis
  • Traffic control and management
  • Radar Systems and Signal Processing
  • Image and Object Detection Techniques
  • Computability, Logic, AI Algorithms
  • Evolutionary Algorithms and Applications
  • Optimization and Search Problems
  • Distributed Control Multi-Agent Systems
  • Image Processing Techniques and Applications
  • Tensor decomposition and applications
  • Anomaly Detection Techniques and Applications
  • Robotics and Sensor-Based Localization
  • Hand Gesture Recognition Systems

Affiliations

California Institute of Technology
2024

Carnegie Mellon University
2019-2023

Mitsubishi Electric (United States)
2021

Tsinghua University
2018

Publications

This paper is concerned with offline reinforcement learning (RL), which learns using pre-collected data without further exploration. Effective offline RL would be able to accommodate distribution shift and limited data coverage. However, prior results either suffer from suboptimal sample complexities or incur a high burn-in cost to reach optimality, thus posing an impediment to efficiency in sample-starved applications. We demonstrate that the model-based (or "plug-in") approach achieves minimax-optimal complexity...

10.1214/23-aos2342 article EN The Annals of Statistics 2024-02-01
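
A minimal sketch of the "plug-in" idea above: estimate an empirical MDP from the offline transitions, then plan (here, by value iteration) in that empirical model. All names are illustrative, and the pessimistic adjustments the paper analyzes for the sample-starved regime are omitted.

import numpy as np

def plug_in_value_iteration(transitions, S, A, gamma=0.95, iters=500):
    # Tally empirical transition counts and mean rewards from (s, a, r, s') tuples.
    counts = np.zeros((S, A, S))
    reward_sum = np.zeros((S, A))
    visits = np.zeros((S, A))
    for s, a, r, s_next in transitions:
        counts[s, a, s_next] += 1
        reward_sum[s, a] += r
        visits[s, a] += 1
    # Plug-in estimates; unvisited pairs fall back to a uniform kernel.
    P_hat = np.where(visits[..., None] > 0,
                     counts / np.maximum(visits, 1)[..., None],
                     1.0 / S)
    r_hat = reward_sum / np.maximum(visits, 1)
    # Plan in the empirical MDP by value iteration.
    V = np.zeros(S)
    for _ in range(iters):
        Q = r_hat + gamma * (P_hat @ V)   # shape (S, A)
        V = Q.max(axis=1)
    return Q.argmax(axis=1), V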

In this paper, we propose a micro hand gesture recognition system and methods using ultrasonic active sensing. This system uses dynamic hand gestures to achieve human-computer interaction (HCI). The implemented system, called hand-ultrasonic gesture (HUG), consists of ultrasonic active sensing, pulsed radar signal processing, and time-sequence pattern recognition by machine learning. We adopt lower-frequency (300 kHz) ultrasonic sensing to obtain high-resolution range-Doppler image features. Using these high-quality sequential features, state-transition-based...

10.1109/access.2018.2868268 article EN cc-by-nc-nd IEEE Access 2018-01-01
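
The range-Doppler image features mentioned above are conventionally formed by a 2D FFT over the pulse returns; a toy version (the HUG system's full pipeline of pulse compression, feature extraction, and sequence classification is not reproduced here):

import numpy as np

def range_doppler_map(echoes):
    # echoes: (num_pulses, num_samples) received ultrasonic pulses,
    # rows indexed by slow time (pulse number), columns by fast time.
    range_profile = np.fft.fft(echoes, axis=1)                       # fast time -> range bins
    rd = np.fft.fftshift(np.fft.fft(range_profile, axis=0), axes=0)  # slow time -> Doppler bins
    return 20.0 * np.log10(np.abs(rd) + 1e-12)                       # magnitude in dB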

Multi-channel sparse blind deconvolution, or convolutional sparse coding, refers to the problem of learning an unknown filter by observing its circulant convolutions with multiple input signals that are sparse. This problem finds numerous applications in signal processing, computer vision, and inverse problems. However, it is challenging to learn the filter efficiently due to the bilinear structure of the observations with respect to the filter and the inputs, as well as the sparsity constraint. In this paper, we propose a novel approach based on nonconvex...

10.1109/tit.2021.3075148 article EN publisher-specific-oa IEEE Transactions on Information Theory 2021-04-22
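
A sketch of the nonconvex-over-the-sphere idea: search for a unit-norm filter whose circular convolution with each observation is spiky. The l4-norm objective below is a common sparsity surrogate in this literature and is an assumption here; the paper's exact smooth loss may differ.

import numpy as np

def circ_conv(x, h):
    # Circular convolution via FFT.
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def circ_corr(x, g):
    # Circular correlation: the adjoint of circular convolution with x.
    return np.real(np.fft.ifft(np.conj(np.fft.fft(x)) * np.fft.fft(g)))

def msbd_sphere_ascent(Y, n, steps=300, lr=0.05, seed=0):
    # Riemannian gradient ascent on the unit sphere maximizing
    # sum_i ||y_i (*) h||_4^4, which rewards spiky (sparse) outputs.
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(n)
    h /= np.linalg.norm(h)
    for _ in range(steps):
        g = np.zeros(n)
        for y in Y:
            z = circ_conv(y, h)
            g += circ_corr(y, 4.0 * z**3)   # Euclidean gradient of ||z||_4^4
        g -= (g @ h) * h                    # project onto the tangent space at h
        h = h + lr * g / len(Y)
        h /= np.linalg.norm(h)              # retract back to the sphere
    return h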

Structural vibration-based human sensing provides an alternative approach for device-free monitoring, which is used in healthcare, space and energy usage management, etc. Prior work in this area mainly focused on single-person walking scenarios, which limits its widespread application. The challenge with multiple walkers is that the observed vibration response is a mixture of each walker's footstep-induced response, making it difficult to identify 1) how many concurrent walkers are present, and 2) the timing of each footstep's impact on the floor. As...

10.1145/3360773.3360887 article EN 2019-10-23

Achieving sample efficiency in online episodic reinforcement learning (RL) requires optimally balancing exploration and exploitation. When it comes to a finite-horizon Markov decision process with $S$ states, $A$ actions and horizon length $H$, substantial progress has been achieved toward characterizing the minimax-optimal regret, which scales on the order of $\sqrt{H^2SAT}$ (modulo log factors) with $T$ the total number of samples. While several competing solution paradigms have been proposed to minimize...

10.1093/imaiai/iaac034 article EN cc-by Information and Inference A Journal of the IMA 2022-12-17
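
For concreteness, the regret benchmark quoted above is usually written as follows (a standard formulation, with $K$ episodes and $T = KH$ samples; constants and log factors suppressed):

\[
  \mathrm{Regret}(T) \;=\; \sum_{k=1}^{K} \Bigl( V_1^{\star}(s_1^{k}) - V_1^{\pi_k}(s_1^{k}) \Bigr)
  \;\asymp\; \sqrt{H^{2} S A T},
\]

so no algorithm can beat the $\sqrt{H^2SAT}$ scaling, and the question studied in this line of work is whether it can be attained without a large burn-in sample size.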

Offline or batch reinforcement learning seeks to learn a near-optimal policy using history data without active exploration of the environment. To counter the insufficient coverage and sample scarcity of many offline datasets, the principle of pessimism has been recently introduced to mitigate high bias of the estimated values. While pessimistic variants of model-based algorithms (e.g., value iteration with lower confidence bounds) have been theoretically investigated, their model-free counterparts -- which do not require...

10.48550/arxiv.2202.13890 preprint EN other-oa arXiv (Cornell University) 2022-01-01
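
A toy rendering of the pessimism principle for model-free offline RL discussed above: a Q-learning style update whose target is penalized by a bonus shrinking with the visit count, so rarely covered state-action pairs receive conservative values. The learning rate and penalty below are illustrative choices, not the paper's exact design.

import numpy as np

def pessimistic_q_learning(dataset, S, A, H, c=1.0):
    Q = np.zeros((H + 1, S, A))            # Q[H] stays zero (terminal step)
    N = np.zeros((H, S, A))                # visit counts per step
    for h, s, a, r, s_next in dataset:     # episodic offline tuples
        N[h, s, a] += 1
        n = N[h, s, a]
        eta = (H + 1) / (H + n)            # rate common in Q-learning analyses
        penalty = c * np.sqrt(H**3 / n)    # illustrative lower-confidence bonus
        target = r + Q[h + 1, s_next].max() - penalty
        Q[h, s, a] = max((1 - eta) * Q[h, s, a] + eta * target, 0.0)
    return Q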

Multi-channel sparse blind deconvolution refers to the problem of learning an unknown filter by observing its circulant convolutions with multiple input signals that are sparse. It is challenging to learn the filter efficiently due to the bilinear structure of the observations with respect to the filter and the inputs, leading to global ambiguities in identification. We propose a novel approach based on nonconvex optimization over the sphere manifold, minimizing a smooth surrogate of a sparsity-promoting loss function. It is demonstrated that gradient descent with random...

10.1109/icassp40776.2020.9054356 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09

This paper is concerned with offline reinforcement learning (RL), which learns using pre-collected data without further exploration. Effective offline RL would be able to accommodate distribution shift and limited data coverage. However, prior algorithms or analyses either suffer from suboptimal sample complexities or incur a high burn-in cost to reach optimality, thus posing an impediment to efficiency in sample-starved applications. We demonstrate that the model-based (or "plug-in") approach achieves...

10.48550/arxiv.2204.05275 preprint EN cc-by arXiv (Cornell University) 2022-01-01

This paper concerns the central issues of model robustness and sample efficiency in offline reinforcement learning (RL), which aims to learn to perform decision making from history data without active exploration. Due to uncertainties and variabilities of the environment, it is critical to learn a robust policy -- with as few samples as possible -- that performs well even when the deployed environment deviates from the nominal one used to collect the dataset. We consider a distributionally robust formulation of offline RL, focusing on tabular Markov decision processes with an...

10.48550/arxiv.2208.05767 preprint EN other-oa arXiv (Cornell University) 2022-01-01
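
The robust Bellman update behind this formulation, sketched for a total-variation uncertainty set (one of the divergences considered in this line of work; the greedy inner solver below is standard, while the paper's contribution is the finite-sample analysis):

import numpy as np

def worst_case_mean(p0, V, sigma):
    # Minimize E_P[V] over distributions with TV(P, p0) <= sigma:
    # greedily move mass from the highest-value states to the lowest one.
    order = np.argsort(V)[::-1]
    p = p0.copy()
    budget = sigma
    lowest = order[-1]
    for i in order[:-1]:
        move = min(p[i], budget)
        p[i] -= move
        p[lowest] += move
        budget -= move
        if budget <= 0:
            break
    return p @ V

def robust_value_iteration(P0, R, sigma, gamma=0.95, iters=300):
    # (s, a)-rectangular robust value iteration on the nominal model P0.
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = np.array([[R[s, a] + gamma * worst_case_mean(P0[s, a], V, sigma)
                       for a in range(A)] for s in range(S)])
        V = Q.max(axis=1)
    return Q.argmax(axis=1), V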

Curriculum Reinforcement Learning (CRL) aims to create a sequence of tasks, starting from easy ones and gradually learning towards difficult tasks. In this work, we focus on the idea of framing CRL as interpolations between a source (auxiliary) and a target task distribution. Although existing studies have shown the great potential of this idea, it remains unclear how to formally quantify and generate the movement between task distributions. Inspired by insights from gradual domain adaptation in semi-supervised learning, we create a natural curriculum...

10.48550/arxiv.2210.10195 preprint EN cc-by-sa arXiv (Cornell University) 2022-01-01
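
One naive way to instantiate "interpolating from source to target task distribution" is a mixture-weight schedule that marches the training distribution toward the target; this is only a strawman baseline, since the paper's point is to quantify and generate this movement in a principled way.

import numpy as np

def mixture_curriculum(p_source, p_target, num_stages):
    # Yields a sequence of intermediate task distributions from source to target.
    for t in range(num_stages + 1):
        alpha = t / num_stages
        yield (1.0 - alpha) * p_source + alpha * p_target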

Offline reinforcement learning (RL), which seeks to learn an optimal policy using offline data, has garnered significant interest due to its potential in critical applications where online data collection is infeasible or expensive. This work explores the benefit of federated learning for offline RL, aiming at collaboratively leveraging offline datasets at multiple agents. Focusing on finite-horizon episodic tabular Markov decision processes (MDPs), we design FedLCB-Q, a variant of the popular model-free Q-learning algorithm...

10.48550/arxiv.2402.05876 preprint EN arXiv (Cornell University) 2024-02-08
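
A sketch of the federated structure described above: each agent performs local Q-updates on its own offline episodes, and a server periodically averages the tables. FedLCB-Q's pessimistic penalties, learning-rate schedule, and importance averaging are omitted; names and constants here are assumptions.

import numpy as np

def federated_offline_q(agent_datasets, S, A, H, rounds=10):
    Q_global = np.zeros((H + 1, S, A))
    for _ in range(rounds):
        local_tables = []
        for data in agent_datasets:
            Q = Q_global.copy()
            for h, s, a, r, s_next in data:          # local offline pass
                target = r + Q[h + 1, s_next].max()
                Q[h, s, a] += 0.1 * (target - Q[h, s, a])
            local_tables.append(Q)
        Q_global = np.mean(local_tables, axis=0)     # communication round: average
    return Q_global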

In offline reinforcement learning (RL), the absence of active exploration calls for attention to model robustness to tackle the sim-to-real gap, where the discrepancy between the simulated and deployed environments can significantly undermine the performance of the learned policy. To endow the learned policy with robustness in a sample-efficient manner in the presence of a high-dimensional state-action space, this paper considers the sample complexity of distributionally robust linear Markov decision processes (MDPs) with an uncertainty set characterized by...

10.48550/arxiv.2403.12946 preprint EN arXiv (Cornell University) 2024-03-19
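
In notation that is assumed here rather than taken from the paper, the distributionally robust objective reads:

\[
  V^{\pi,\sigma}(s) \;=\; \inf_{P \in \mathcal{U}^{\sigma}(P^{0})} V^{\pi,P}(s),
  \qquad
  \mathcal{U}^{\sigma}(P^{0}) \;=\; \Bigl\{ P : \tfrac{1}{2}\bigl\| P(\cdot \mid s,a) - P^{0}(\cdot \mid s,a) \bigr\|_{1} \le \sigma \ \ \forall (s,a) \Bigr\},
\]

where $P^{0}$ is the nominal (simulated) kernel and the learner seeks a policy maximizing the worst-case value from offline data.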

To overcome the sim-to-real gap in reinforcement learning (RL), learned policies must maintain robustness against environmental uncertainties. While robust RL has been widely studied in single-agent regimes, in multi-agent environments the problem remains understudied -- despite the fact that the problems posed by environmental uncertainties are often exacerbated by strategic interactions. This work focuses on distributionally robust Markov games (RMGs), a robust variant of standard Markov games, wherein each agent aims to learn a policy that maximizes...

10.48550/arxiv.2404.18909 preprint EN arXiv (Cornell University) 2024-04-29

Safe reinforcement learning (RL) is crucial for deploying RL agents in real-world applications, as it aims to maximize long-term rewards while satisfying safety constraints. However, safe RL often suffers from sample inefficiency, requiring extensive interactions with the environment to learn a safe policy. We propose Efficient Safe Policy Optimization (ESPO), a novel approach that enhances the sample efficiency of safe RL through sample manipulation. ESPO employs an optimization framework with three modes: maximizing rewards, minimizing...

10.48550/arxiv.2405.20860 preprint EN arXiv (Cornell University) 2024-05-31
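
The "three modes" can be pictured as a simple switching rule on the estimated cost of the current policy; the thresholds below are placeholders for illustration, not ESPO's actual criterion.

def choose_update_mode(avg_cost, cost_limit, slack=0.05):
    # Pick the optimization mode based on proximity to the safety budget.
    if avg_cost > cost_limit:
        return "minimize_cost"                      # violating: prioritize safety
    if avg_cost > (1.0 - slack) * cost_limit:
        return "balance"                            # near the boundary: trade off
    return "maximize_reward"                        # comfortably safe: pursue reward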

A significant roadblock to the development of principled multi-agent reinforcement learning is the fact that desired solution concepts like Nash equilibria may be intractable to compute. To overcome this obstacle, we take inspiration from behavioral economics and show that -- by imbuing agents with important features of human decision-making like risk aversion and bounded rationality -- a class of risk-averse quantal response equilibria (RQE) become tractable to compute in all $n$-player matrix and finite-horizon Markov games. In...

10.48550/arxiv.2406.14156 preprint EN arXiv (Cornell University) 2024-06-20
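
Quantal response, one of the two behavioral ingredients above, replaces a hard best response with a softmax over expected payoffs. A minimal version for one player of a matrix game (risk aversion, the other ingredient of RQE, would first transform payoffs by a risk measure and is omitted here):

import numpy as np

def quantal_response(payoff, opponent_mix, temperature=1.0):
    # payoff: (A_self, A_opp) matrix; opponent_mix: opponent's mixed strategy.
    expected = payoff @ opponent_mix          # expected payoff per own action
    logits = expected / temperature
    logits -= logits.max()                    # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()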

We study the problem of Distributionally Robust Constrained RL (DRC-RL), where the goal is to maximize the expected reward subject to environmental distribution shifts and constraints. This setting captures situations where training and testing environments differ, and policies must satisfy constraints motivated by safety or limited budgets. Despite significant progress toward algorithm design for the separate problems of distributionally robust RL and constrained RL, there do not yet exist algorithms with end-to-end convergence...

10.48550/arxiv.2406.15788 preprint EN arXiv (Cornell University) 2024-06-22
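
One standard way to write the DRC-RL objective described above (notation assumed, not taken from the paper): with reward value $V_r$, constraint value $V_c$, budget $b$, and uncertainty set $\mathcal{U}$ around the nominal dynamics,

\[
  \max_{\pi}\ \inf_{P \in \mathcal{U}} V_{r}^{\pi,P}
  \qquad \text{s.t.} \qquad
  \inf_{P \in \mathcal{U}} V_{c}^{\pi,P} \;\ge\; b,
\]

so both the return and the constraint must survive worst-case environmental shift, which is what couples the two problems and complicates end-to-end guarantees.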

Offline model-based reinforcement learning (MBRL) enhances data efficiency by utilizing pre-collected datasets to learn models and policies, especially in scenarios where exploration is costly or infeasible. Nevertheless, its performance often suffers from the objective mismatch between model learning and policy learning, resulting in inferior performance despite accurate model predictions. This paper first identifies that the primary source of this mismatch comes from the underlying confounders present in offline data for MBRL. Subsequently, we introduce...

10.48550/arxiv.2407.10967 preprint EN arXiv (Cornell University) 2024-07-15

Standard multi-agent reinforcement learning (MARL) algorithms are vulnerable to sim-to-real gaps. To address this, distributionally robust Markov games (RMGs) have been proposed to enhance robustness in MARL by optimizing the worst-case performance when game dynamics shift within a prescribed uncertainty set. Solving RMGs remains under-explored, from problem formulation to the development of sample-efficient algorithms. A notorious yet open challenge is whether RMGs can escape the curse of multiagency, where the sample...

10.48550/arxiv.2409.20067 preprint EN arXiv (Cornell University) 2024-09-30

Online reinforcement learning (RL) typically requires high-stakes online interaction data to learn a policy for a target task. This prompts interest in leveraging historical data to improve sample efficiency. The historical data may come from outdated or related source environments with different dynamics. It remains unclear how to effectively use such data in the target task to provably enhance learning and sample efficiency. To address this, we propose a hybrid transfer RL (HTRL) setting, where an agent learns in a target environment while accessing offline data from a shifted source environment. We show...

10.48550/arxiv.2411.03810 preprint EN arXiv (Cornell University) 2024-11-06

Reinforcement Learning (RL) algorithms are known to suffer from the curse of dimensionality, which refers to the fact that large-scale problems often lead to exponentially high sample complexity. A common solution is to use deep neural networks for function approximation; however, such approaches typically lack theoretical guarantees. To provably address this issue, we observe that many real-world problems exhibit task-specific model structures that, when properly leveraged, can improve the sample efficiency of RL. Building on this insight,...

10.48550/arxiv.2411.07591 preprint EN arXiv (Cornell University) 2024-11-12

Multi-channel sparse blind deconvolution, or convolutional sparse coding, refers to the problem of learning an unknown filter by observing its circulant convolutions with multiple input signals that are sparse. This problem finds numerous applications in signal processing, computer vision, and inverse problems. However, it is challenging to learn the filter efficiently due to the bilinear structure of the observations with respect to the filter and the inputs, as well as the sparsity constraint. In this paper, we propose a novel approach based on nonconvex...

10.48550/arxiv.1911.11167 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Floor vibration-based sensing provides an alternative approach for multiple occupant localization, enabling various smart building applications such as elderly care. Prior work mainly focuses on detecting the onsets of individual footsteps from overlapping signals for localization. However, the error rate is higher than that of single footsteps. In this work, we present a data quality-informed time-sequence approach for accurate multi-people localization. The intuition is that when signals overlap, part of the signal has a lower SNR, which can be...

10.1145/3376897.3379162 article EN 2020-02-25

The signal quality of real-world infrastructure sensing systems varies significantly over the deployment environment and hardware implementation. Quantifying this quality allows further optimization of the system configuration to enhance application performance. In our previous work, we chose a standard excitation and applied it repeatedly to assess signal quality, which is labor-intensive and difficult to scale. In this work, we 1) utilize the mobility of people to scale up the assessment, and 2) use a mobile sensing system to identify the subset of human-induced signals that can be used as the standard excitation.

10.1145/3376897.3379165 article EN 2020-02-25