Lei Han

ORCID: 0000-0003-1404-2415
About
Research Areas
  • Reinforcement Learning in Robotics
  • Artificial Intelligence in Games
  • Robotic Locomotion and Control
  • Advanced Bandit Algorithms Research
  • Human Pose and Action Recognition
  • Distributed Sensor Networks and Detection Algorithms
  • Speech and Audio Processing
  • Speech and dialogue systems
  • Speech Recognition and Synthesis
  • Evolutionary Algorithms and Applications
  • Modular Robots and Swarm Intelligence
  • Adaptive Dynamic Programming Control
  • Statistical Methods and Inference
  • Human Motion and Animation
  • Robotic Path Planning Algorithms
  • Domain Adaptation and Few-Shot Learning
  • Digital Games and Media
  • Video Surveillance and Tracking Methods
  • Auction Theory and Applications
  • Quality Function Deployment in Product Design
  • Experimental Behavioral Economics Studies
  • Model Reduction and Neural Networks
  • Adversarial Robustness in Machine Learning
  • Face and Expression Recognition
  • Machine Learning and Data Classification

Tencent (China)
2018-2025

Yunnan Agricultural University
2025

Beijing Urban Systems Engineering Research Center
2020

Most existing deep reinforcement learning (DRL) frameworks consider either a discrete action space or a continuous action space solely. Motivated by applications in computer games, we consider the scenario with a discrete-continuous hybrid action space. To handle the hybrid action space, previous works either approximate it by discretization or relax it into a continuous set. In this paper, we propose a parametrized deep Q-network (P-DQN) framework for the hybrid action space without approximation or relaxation. Our algorithm combines the spirits of both DQN (dealing with a discrete action space) and DDPG (dealing with a continuous action space) by seamlessly integrating...

10.48550/arxiv.1810.06394 preprint EN other-oa arXiv (Cornell University) 2018-01-01
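The hybrid action selection described above can be sketched in a few lines: a parameter network proposes a continuous parameter for every discrete action, and a Q-network then picks the discrete action by argmax. This is a minimal illustrative sketch with random linear "networks" standing in for the deep ones; the dimensions and variable names are assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
STATE_DIM, N_DISCRETE, PARAM_DIM = 4, 3, 2

# Hypothetical linear stand-ins for the two deep networks in P-DQN:
# a parameter network x_k = W_k s and a Q-network Q(s, k, x_k).
W_param = rng.normal(size=(N_DISCRETE, PARAM_DIM, STATE_DIM))
W_q = rng.normal(size=(N_DISCRETE, STATE_DIM + PARAM_DIM))

def select_hybrid_action(state):
    """Return (k, x_k): a discrete action index and its continuous parameter."""
    params = W_param @ state                          # one parameter vector per k
    q_vals = [W_q[k] @ np.concatenate([state, params[k]])
              for k in range(N_DISCRETE)]
    k = int(np.argmax(q_vals))                        # DQN-style argmax over k
    return k, params[k]                               # DDPG-style continuous part

k, x = select_hybrid_action(rng.normal(size=STATE_DIM))
print(k, x.shape)
```

In training, the discrete branch would be updated with a DQN-style TD loss and the parameter network with a DDPG-style deterministic policy gradient; here only the action-selection structure is shown.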

Humans can perceive our complex world through multi-sensory fusion. Under limited visual conditions, people can sense a variety of tactile signals to identify objects accurately and rapidly. However, replicating this unique capability in robots remains a significant challenge. Here, we present a new form of ultralight multifunctional nano-layered carbon aerogel sensor that provides pressure, temperature, material recognition and 3D location capabilities, which is combined with multimodal supervised...

10.1007/s40820-023-01216-0 article EN cc-by Nano-Micro Letters 2023-11-09

Recent advances in learning reusable motion priors have demonstrated their effectiveness in generating naturalistic behaviors. In this paper, we propose a new learning framework for controlling physics-based characters with improved quality and diversity over existing methods. The proposed method uses reinforcement learning (RL) to initially track and imitate life-like movements from unstructured motion clips using the discrete information bottleneck, as adopted in the Vector Quantized Variational AutoEncoder (VQ-VAE)...

10.1145/3618397 article EN ACM Transactions on Graphics 2023-12-05

StarCraft II (SC2) is widely considered the most challenging Real Time Strategy (RTS) game. The underlying challenges include a large observation space, a huge (continuous and infinite) action space, partial observations, simultaneous moves for all players, and long-horizon delayed rewards for local decisions. To push the frontier of AI research, DeepMind and Blizzard jointly developed the StarCraft II Learning Environment (SC2LE) as a testbench for complex decision-making systems. SC2LE provides a few mini games such as MoveToBeacon,...

10.48550/arxiv.1809.07193 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Embodied artificial intelligence (EAI) integrates advanced AI models into physical entities for real-world interaction. The emergence of foundation models as the "brain" of EAI agents for high-level task planning has shown promising results. However, the deployment of these agents in physical environments presents significant safety challenges. For instance, a housekeeping robot lacking sufficient risk awareness might place a metal container in a microwave, potentially causing a fire. To address these critical...

10.21203/rs.3.rs-5540665/v1 preprint EN cc-by Research Square (Research Square) 2025-02-03

In this paper, we present a general learning framework for controlling a quadruped robot that can mimic the behavior of real animals and traverse challenging terrains. Our method consists of two steps: an imitation step to learn from the motions of real animals, and a terrain adaptation step to enable generalization to unseen terrains. We capture the motions of a Labrador on various terrains to facilitate adaptive locomotion. Experiments demonstrate that our policy can produce natural-looking behavior. We deployed...

10.1109/iros55552.2023.10342271 article EN 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2023-10-01

StarCraft, one of the most difficult esports games with a long-standing history of professional tournaments, has attracted generations of players and fans, and also intense attention in artificial intelligence research. Recently, Google's DeepMind announced AlphaStar, a grandmaster-level AI for StarCraft II that can play against humans using a comparable action space and operations. In this paper, we introduce a new agent, named TStarBot-X, which is trained under orders of magnitude less computation and can play competitively against expert human players...

10.48550/arxiv.2011.13729 preprint EN cc-by arXiv (Cornell University) 2020-01-01

Summarizing knowledge from animals and human beings inspires robotic innovations. In this work, we propose a framework for driving legged robots to act like real animals with lifelike agility and strategy in complex environments. Inspired by large pre-trained models that have witnessed impressive performance in language and image understanding, we introduce the power of advanced deep generative models to produce motor control signals stimulating legged robots to act like real animals. Unlike conventional controllers and end-to-end RL methods...

10.21203/rs.3.rs-3309878/v1 preprint EN cc-by Research Square (Research Square) 2023-09-29

Efficient exploration remains a challenging problem in reinforcement learning, especially for tasks where extrinsic rewards from environments are sparse or even totally disregarded. Significant advances based on intrinsic motivation show promising results in simple environments but often get stuck in environments with multimodal and stochastic dynamics. In this work, we propose a variational dynamic model based on conditional variational inference to model the multimodality and stochasticity. We consider the environmental state-action transition as a conditional generative...

10.1109/tnnls.2021.3129160 article EN IEEE Transactions on Neural Networks and Learning Systems 2021-12-01

Solving multi-goal reinforcement learning (RL) problems with sparse rewards is generally challenging. Existing approaches have utilized goal relabeling on collected experiences to alleviate the issues raised by sparse rewards. However, these methods are still limited in efficiency and cannot make full use of experiences. In this paper, we propose Model-based Hindsight Experience Replay (MHER), which exploits experiences more efficiently by leveraging environmental dynamics to generate virtual achieved goals...

10.48550/arxiv.2107.00306 preprint EN other-oa arXiv (Cornell University) 2021-01-01
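The core idea of model-based relabeling can be illustrated in a toy form: roll a learned dynamics model forward from a stored transition and use the reached state as a virtual achieved goal. This is a hedged sketch, not MHER's exact interface; the transition layout, the sparse reward shape, and the `toy_model` are all assumptions for illustration.

```python
import numpy as np

def relabel_with_model(transition, dynamics_model, n_steps=3):
    """Generate a virtual goal by rolling a (learned) dynamics model forward,
    then recompute the sparse reward against that goal (illustrative only)."""
    state, action, _, next_state, _ = transition
    virtual = next_state
    for _ in range(n_steps):
        virtual = dynamics_model(virtual, action)     # model-generated rollout
    goal = virtual                                    # virtual achieved goal
    reward = 0.0 if np.allclose(next_state, goal, atol=1e-6) else -1.0
    return (state, action, goal, next_state, reward)

# Toy deterministic "model": the state drifts by the action each step.
toy_model = lambda s, a: s + a
s, a = np.zeros(2), np.ones(2)
t = relabel_with_model((s, a, None, s + a, None), toy_model, n_steps=0)
print(t[4])  # with n_steps=0 the goal equals next_state, so the reward is 0.0
```

The relabeled transitions would then be mixed into the replay buffer alongside the original ones, as in standard hindsight experience replay.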

Competitive Self-Play (CSP) based Multi-Agent Reinforcement Learning (MARL) has shown phenomenal breakthroughs recently. Strong AIs have been achieved for several benchmarks, including Dota 2, Glory of Kings, Quake III, and StarCraft II, to name a few. Despite the success, MARL training is extremely data-thirsty, typically requiring billions (if not trillions) of frames to be seen from the environment during training in order to learn a high-performance agent. This poses non-trivial difficulties for researchers or...

10.48550/arxiv.2011.12895 preprint EN cc-by arXiv (Cornell University) 2020-01-01

The model averaging problem is to average multiple models to achieve a prediction accuracy not much worse than that of the best single model in terms of mean-squared error. It is known that if the models are misspecified, model averaging is superior to model selection. Specifically, let $n$ be the sample size; then the worst-case regret of the former decays at rate $O(1/n)$, whereas...

10.1109/tit.2018.2805903 article EN IEEE Transactions on Information Theory 2018-02-14
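A small in-sample demonstration of why averaging can beat selecting a single misspecified model: fit two polynomial candidates to nonlinear data and combine their predictions with least-squares weights. This is an illustrative sketch of the general phenomenon, not the paper's estimator; the data-generating process and weighting scheme are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.uniform(-1, 1, n)
y = np.sin(2 * x) + 0.1 * rng.normal(size=n)   # neither candidate is correctly specified

# Two misspecified candidates: degree-1 and degree-3 least-squares fits.
preds = np.array([np.polyval(np.polyfit(x, y, deg), x) for deg in (1, 3)])
mse = ((preds - y) ** 2).mean(axis=1)          # per-model mean-squared error
best_single = mse.min()                        # what model selection could achieve

# Least-squares weights over the candidates' predictions (in-sample).
w, *_ = np.linalg.lstsq(preds.T, y, rcond=None)
averaged = ((preds.T @ w - y) ** 2).mean()
print(best_single, averaged)
```

Because choosing a single model corresponds to a weight vector with one entry equal to 1, the least-squares combination can never do worse in-sample than the best single candidate.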

The neural network language model (NNLM) is an essential component of industrial ASR systems. One important challenge in training an NNLM is to balance scaling up the learning process and handling big data. Conventional approaches such as blockwise model-update filtering with block momentum (BMUF) achieve almost linear speedups with no performance degradation for speech recognition. However, BMUF needs to calculate the model average from all computing nodes (e.g., GPUs), and when the number of nodes is large, it suffers severe...

10.1109/icassp40776.2020.9053637 article EN ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2020-04-09
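The BMUF step mentioned above can be sketched as follows: average the node models, treat the difference from the current global model as a block-level gradient, and filter it with block momentum. This is a minimal sketch of the basic BMUF update (some variants add Nesterov-style lookahead); parameter names are assumptions.

```python
import numpy as np

def bmuf_update(global_model, node_models, momentum_buf,
                block_momentum=0.9, block_lr=1.0):
    """One blockwise model-update filtering step (illustrative sketch).

    global_model: current global parameter vector
    node_models:  list of parameter vectors after local training on each node
    momentum_buf: filtered update carried over from the previous block
    """
    avg = np.mean(node_models, axis=0)          # model average across nodes
    delta = avg - global_model                  # aggregated block update
    momentum_buf = block_momentum * momentum_buf + block_lr * delta
    return global_model + momentum_buf, momentum_buf

g = np.zeros(3)
nodes = [np.ones(3), 3.0 * np.ones(3)]
g_new, buf = bmuf_update(g, nodes, np.zeros(3), block_momentum=0.0)
print(g_new)  # with zero momentum this reduces to plain model averaging
```

The cost the abstract points at is visible here: every step requires the average over all node models, which becomes a communication bottleneck as the node count grows.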

In this article, we study the problem of guaranteed display ads (GDAs) allocation, which requires proactively allocating different impressions to fulfill the impression demands indicated in contracts. Existing methods for this problem either assume that the impressions are static or solely consider a specific ad's benefits. Thus, it is hard to generalize them to the industrial production scenario, where the impressions are dynamical and large-scale, and the overall allocation optimality of all the considered GDAs is required. To bridge this gap, we formulate the allocation problem as a sequential...

10.1109/tnnls.2021.3070484 article EN IEEE Transactions on Neural Networks and Learning Systems 2021-05-17

One principled approach for provably efficient exploration is incorporating the upper confidence bound (UCB) into the value function as a bonus. However, the UCB is specified for linear and tabular settings and is incompatible with Deep Reinforcement Learning (DRL). In this paper, we propose a principled exploration method for DRL through Optimistic Bootstrapping and Backward Induction (OB2I). OB2I constructs a general-purpose UCB-bonus through non-parametric bootstrap in DRL. The UCB-bonus estimates the epistemic uncertainty of state-action pairs for optimistic...

10.48550/arxiv.2105.06022 preprint EN other-oa arXiv (Cornell University) 2021-01-01
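The bootstrap-based bonus idea can be sketched simply: train an ensemble of Q-heads on bootstrap resamples of the data, and use the spread of their estimates at a state-action pair as a proxy for epistemic uncertainty. This is an illustrative sketch of the general technique, not OB2I's exact construction; the ensemble here is just randomly initialized linear heads.

```python
import numpy as np

def ucb_bonus(q_ensemble, state_action, scale=1.0):
    """Exploration bonus from the spread of a bootstrapped Q-ensemble.
    High disagreement among heads ~ high epistemic uncertainty."""
    estimates = np.array([q(state_action) for q in q_ensemble])
    return scale * estimates.std()

# Toy ensemble: 5 linear Q-heads (stand-ins for heads trained on different
# bootstrap resamples of the replay buffer).
rng = np.random.default_rng(0)
heads = [(lambda w: (lambda sa: w @ sa))(rng.normal(size=3)) for _ in range(5)]
bonus = ucb_bonus(heads, np.ones(3))
print(bonus)
```

In a full method the bonus would be added to the TD target so that rarely visited state-action pairs look optimistically valuable, driving directed exploration.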

Learning-based methods have improved the locomotion skills of quadruped robots through deep reinforcement learning. However, the sim-to-real gap and low sample efficiency still limit skill transfer. To address this issue, we propose an efficient model-based learning framework that combines a world model with a policy network. We train a differentiable world model to predict future states and use it to directly supervise a Variational Autoencoder (VAE)-based policy network to imitate real animal behaviors. This significantly...

10.48550/arxiv.2403.01962 preprint EN arXiv (Cornell University) 2024-03-04