- Multimodal Machine Learning Applications
- Robotic Path Planning Algorithms
- Robotics and Sensor-Based Localization
- Reinforcement Learning in Robotics
- Advanced Neural Network Applications
- Evolutionary Algorithms and Applications
- Natural Language Processing Techniques
- Domain Adaptation and Few-Shot Learning
- Robot Manipulation and Learning
- Evacuation and Crowd Dynamics
- Human Pose and Action Recognition
- Autonomous Vehicle Technology and Safety
- Neural Dynamics and Brain Function
Southeast University
2024
Southeast University
2023
Multi-task learning is an important problem in reinforcement learning. Training multiple tasks together brings benefits from the useful information shared across different tasks and often achieves higher performance compared to single-task learning. However, it remains unclear how the parameters of the network should be reused across tasks. Instead of naively sharing parameters across all tasks, we propose an attention-based mixture-of-experts multi-task approach to learn a compositional policy for each task. The expert networks learn task-specific skills...
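The composition described above can be illustrated with a minimal sketch: task-specific attention weights gate a shared pool of expert networks to produce one action per observation. All names, shapes, and the linear experts here are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
obs_dim, act_dim, n_experts = 4, 2, 3  # hypothetical sizes

# Shared expert networks: each expert is a linear "skill" (illustrative).
experts = [rng.standard_normal((act_dim, obs_dim)) for _ in range(n_experts)]

# Task-specific attention parameters that score the experts per observation.
task_attention = rng.standard_normal((n_experts, obs_dim))

def compose_policy(obs):
    # Attention decides how much each expert contributes for this task...
    weights = softmax(task_attention @ obs)
    # ...and the compositional action is the attention-weighted mixture.
    return sum(w * (E @ obs) for w, E in zip(weights, experts))

action = compose_policy(rng.standard_normal(obs_dim))
assert action.shape == (act_dim,)
```

Sharing the expert pool while keeping the attention weights per task is what lets useful skills transfer across tasks without naively tying every parameter.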
In this paper, we present XuanCe, a comprehensive and unified deep reinforcement learning (DRL) library designed to be compatible with PyTorch, TensorFlow, and MindSpore. XuanCe offers a wide range of functionalities, including over 40 classical DRL and multi-agent algorithms, with the flexibility to easily incorporate new algorithms and environments. It is a versatile platform that supports CPU, GPU, and Ascend, and can be executed on various operating systems such as Ubuntu, Windows, MacOS, and EulerOS. Extensive benchmarks conducted...
Zero-shot object navigation is a challenging task for home-assistance robots. The task emphasizes visual grounding, commonsense inference, and locomotion abilities, where the first two are inherent in foundation models. For the locomotion part, however, most works still depend on map-based planning approaches. The gap between RGB space and map space makes it difficult to directly transfer knowledge from foundation models to such tasks. In this work, we propose a Pixel-guided Navigation skill (PixNav), which bridges this gap for the embodied task. It is straightforward...
While large language models (LLMs) are successful in completing various language processing tasks, they easily fail to interact with the physical world by properly generating control sequences. We find that the main reason is that LLMs are not grounded in the physical world. Existing LLM-based approaches circumvent this problem by relying on additional pre-defined skills or pre-trained sub-policies, making it hard to adapt to new tasks. In contrast, we aim to address this problem directly and explore the possibility of prompting an LLM to accomplish a series of robotic manipulation...
Recently, learning-based approaches have shown promising results in navigation tasks. However, poor generalization capability and the simulation-reality gap prevent a wide range of applications. We consider the problem of improving the generalization of mobile robots and achieving sim-to-real transfer for navigation skills. To that end, we propose a cross-modal fusion method within a knowledge distillation framework for better generalization. This is realized by a teacher-student distillation architecture. The teacher learns a discriminative representation and the near-perfect...
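The teacher-student distillation idea can be sketched as a temperature-softened KL objective: the student is trained to match the teacher's output distribution. The logits, the temperature value, and the function names below are illustrative assumptions; the paper's actual loss and inputs may differ.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)

# Hypothetical action logits: the teacher sees privileged (near-perfect)
# state, the student only cross-modal fused sensor features.
teacher_logits = np.array([2.0, 0.5, -1.0])
student_logits = rng.standard_normal(3)

def kl_distill_loss(t_logits, s_logits, tau=2.0):
    # Soften both distributions with temperature tau, then take
    # KL(teacher || student), which the student minimizes.
    p = softmax(t_logits / tau)
    q = softmax(s_logits / tau)
    return float(np.sum(p * (np.log(p) - np.log(q))))

loss = kl_distill_loss(teacher_logits, student_logits)
```

Minimizing this loss pulls the student's (sensor-only) policy toward the teacher's behavior, which is one common way a sim-trained privileged policy is transferred to a deployable one.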
In recent years, learning-based approaches have demonstrated significant promise in addressing intricate navigation tasks. Traditional methods for training deep neural network policies rely on meticulously designed reward functions or extensive teleoperation datasets as demonstrations. However, the former is often confined to simulated environments, and the latter demands substantial human labor, making it a time-consuming process. Our vision is for robots to autonomously learn navigation skills and adapt to their...