Task Aware Dreamer for Task Generalization in Reinforcement Learning

Relevance Adaptability
DOI: 10.48550/arxiv.2303.05092 Publication Date: 2023-01-01
ABSTRACT
A long-standing goal of reinforcement learning is to acquire agents that can learn on training tasks and generalize well to unseen tasks that may share similar dynamics but have different reward functions. The ability to generalize across tasks is important, as it determines an agent's adaptability to real-world scenarios where reward mechanisms might vary. In this work, we first show that training a general world model can utilize similar structures in these tasks and help train more generalizable agents. Extending world models into the task generalization setting, we introduce a novel method named Task Aware Dreamer (TAD), which integrates reward-informed features to identify consistent latent characteristics across tasks. Within TAD, we compute the variational lower bound of the sample data log-likelihood, which introduces a new term designed to differentiate tasks using their states, as the optimization objective of our reward-informed world models. To demonstrate the advantages of the reward-informed policy in TAD, we introduce a new metric called Task Distribution Relevance (TDR), which quantitatively measures the relevance of different tasks. For tasks exhibiting a high TDR, i.e., tasks that differ significantly, we illustrate that Markovian policies struggle to distinguish them, so it is necessary to utilize reward-informed policies in TAD. Extensive experiments on both image-based and state-based tasks show that TAD can significantly improve performance in handling different tasks simultaneously, especially for those with high TDR, and display strong generalization ability to unseen tasks.
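The claim that Markovian policies struggle to distinguish tasks with shared dynamics but different rewards can be illustrated with a toy example. The sketch below is not TAD and does not compute TDR; the chain environment, the two tasks, and all names (`step`, `rollout`, `markovian`, `reward_informed`) are hypothetical and chosen only for illustration. Two tasks share the same dynamics on a five-state chain, and each rewards a different action. A policy that sees only the state picks the same actions in both tasks, so it cannot do well on both, whereas a policy that also sees its previous reward can identify the task after one step.

```python
from itertools import product

# Hypothetical toy setup: a 5-state chain with identical dynamics in both tasks.
N_STATES, START, HORIZON = 5, 2, 8
TASKS = {"go_right": +1, "go_left": -1}  # task name -> action that earns reward


def step(state, action, rewarded_action):
    """Shared dynamics (clipped move); only the reward depends on the task."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if action == rewarded_action else 0.0
    return next_state, reward


def rollout(policy, rewarded_action):
    """Run one episode; policy(state, prev_action, prev_reward) -> -1 or +1."""
    state, prev_action, prev_reward, total = START, None, None, 0.0
    for _ in range(HORIZON):
        action = policy(state, prev_action, prev_reward)
        state, reward = step(state, action, rewarded_action)
        prev_action, prev_reward = action, reward
        total += reward
    return total


def mean_return(policy):
    """Average return over both tasks."""
    return sum(rollout(policy, a) for a in TASKS.values()) / len(TASKS)


def markovian(table):
    """Deterministic policy that looks at the current state only."""
    return lambda state, prev_action, prev_reward: table[state]


def reward_informed(state, prev_action, prev_reward):
    """Probe with +1, then repeat a rewarded action or flip an unrewarded one."""
    if prev_action is None:
        return +1
    return prev_action if prev_reward > 0 else -prev_action


# Enumerate every deterministic state-only policy (2^5 action tables).
best_markovian = max(
    mean_return(markovian(table)) for table in product((-1, +1), repeat=N_STATES)
)
informed = mean_return(reward_informed)

print("best Markovian mean return: ", best_markovian)
print("reward-informed mean return:", informed)
```

In this toy, every state-only policy takes the same action sequence in both tasks, and each action is rewarded in exactly one of them, so the two returns always sum to the horizon. The reward-informed policy loses at most one step to probing.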