Not All Tasks Are Equally Difficult: Multi-Task Deep Reinforcement Learning with Dynamic Depth Routing

Adaptive routing
DOI: 10.1609/aaai.v38i11.29129 Publication Date: 2024-03-25T11:01:54Z
ABSTRACT
Multi-task reinforcement learning endeavors to accomplish a set of different tasks with a single policy. To enhance data efficiency by sharing parameters across multiple tasks, a common practice segments the network into distinct modules and trains a routing network to recombine these modules into task-specific policies. However, existing routing approaches employ a fixed number of modules for all tasks, neglecting that tasks with varying difficulties commonly require varying amounts of knowledge. This work presents a Dynamic Depth Routing (D2R) framework, which learns strategic skipping of certain intermediate modules, thereby flexibly choosing different numbers of modules for each task. Under this framework, we further introduce a ResRouting method to address the issue of disparate routing paths between behavior and target policies during off-policy training. In addition, we design an automatic route-balancing mechanism to encourage continued routing exploration for unmastered tasks without disturbing the routing of mastered ones. We conduct extensive experiments on various robotics manipulation tasks in the Meta-World benchmark, where D2R achieves state-of-the-art performance with significantly improved learning efficiency.
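
The idea of giving each task its own number of modules can be illustrated with a toy sketch. The code below is not the D2R implementation: it assumes a simple chain of shared modules and a per-task binary keep/skip gate obtained by thresholding fixed random logits, whereas the paper learns its routing. All names (`NUM_MODULES`, `HIDDEN`, `NUM_TASKS`, `select_modules`, `apply_module`, `forward`) are hypothetical and chosen only for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical sizes, chosen only for illustration.
NUM_MODULES = 4   # number of shared modules in the stack
HIDDEN = 8        # feature width of every module
NUM_TASKS = 3     # number of tasks sharing the modules


def make_module():
    """Create one shared module: a dense layer (weights, bias)."""
    w = [[random.gauss(0.0, 0.5) for _ in range(HIDDEN)] for _ in range(HIDDEN)]
    b = [0.0] * HIDDEN
    return w, b


modules = [make_module() for _ in range(NUM_MODULES)]

# Assumption: fixed random routing logits stand in for a learned routing network.
route_logits = [
    [random.gauss(0.0, 1.0) for _ in range(NUM_MODULES)] for _ in range(NUM_TASKS)
]


def select_modules(task_id, threshold=0.5):
    """Return a keep/skip mask over modules for the given task."""
    return [1.0 / (1.0 + math.exp(-z)) > threshold for z in route_logits[task_id]]


def apply_module(h, module):
    """Dense layer followed by ReLU."""
    w, b = module
    return [
        max(0.0, sum(h[i] * w[i][j] for i in range(HIDDEN)) + b[j])
        for j in range(HIDDEN)
    ]


def forward(x, keep):
    """Run the kept modules in order; a skipped module passes features through unchanged."""
    h = list(x)
    for module, use in zip(modules, keep):
        if use:
            h = apply_module(h, module)
    return h, sum(keep)  # features and the effective depth used


x = [1.0] * HIDDEN
for task_id in range(NUM_TASKS):
    keep = select_modules(task_id)
    _, depth = forward(x, keep)
    print(f"task {task_id}: mask={keep}, depth={depth}")
```

In this sketch, the effective depth is simply the number of kept modules, so different tasks can use different amounts of shared computation. The ResRouting and route-balancing components described in the abstract are not modeled here.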