Latent Space Policies for Hierarchical Reinforcement Learning
FOS: Computer and information sciences
Machine Learning (cs.LG)
Artificial Intelligence (cs.AI)
Machine Learning (stat.ML)
DOI:
10.48550/arxiv.1804.02808
Publication Date:
2018-04
AUTHORS (4): Tuomas Haarnoja, Kristian Hartikainen, Pieter Abbeel, Sergey Levine
ABSTRACT
We address the problem of learning hierarchical deep neural network policies for reinforcement learning. In contrast to methods that explicitly restrict or cripple lower layers of a hierarchy to force them to use higher-level modulating signals, each layer in our framework is trained to directly solve the task, but acquires a range of diverse strategies via a maximum entropy reinforcement learning objective. Each layer is also augmented with latent random variables, which are sampled from a prior distribution during the training of that layer. The maximum entropy objective causes these latent variables to be incorporated into the layer's policy, and the higher-level layer can directly control the behavior of the lower layer through this latent space. Furthermore, by constraining the mapping from latent variables to actions to be invertible, higher layers retain full expressivity: neither the higher layers nor the lower layers are constrained in their behavior. Our experimental evaluation demonstrates that we can improve on the performance of single-layer policies on standard benchmark tasks simply by adding additional layers, and that our method can solve more complex sparse-reward tasks by learning higher-level policies on top of high-entropy skills optimized for simple low-level objectives.
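Since the abstract compresses the construction, a small sketch may help. Each layer is a state-conditioned bijection from a latent variable to an action, trained with the maximum entropy objective E[ sum_t r(s_t, a_t) + alpha * H(pi(. | s_t)) ]; a higher layer acts by choosing the latent of the layer below it, and invertibility guarantees no expressivity is lost in the stack. Below is a minimal, illustrative sketch under simplifying assumptions: AffineLatentPolicy and its fixed random weights are hypothetical stand-ins for the conditional flow layers the paper actually trains, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

class AffineLatentPolicy:
    """One hierarchy layer: an invertible, state-conditioned map from a
    latent h to an action a, here a = mu(s) + sigma(s) * h with sigma > 0.
    Hypothetical affine stand-in for the paper's learned flow layers."""

    def __init__(self, state_dim, act_dim):
        # Fixed random linear "networks" for illustration only; in the paper
        # these are neural networks trained with the max-entropy objective.
        self.W_mu = rng.normal(scale=0.1, size=(act_dim, state_dim))
        self.W_logsig = rng.normal(scale=0.1, size=(act_dim, state_dim))

    def forward(self, s, h):
        mu = self.W_mu @ s
        sigma = np.exp(self.W_logsig @ s)
        return mu + sigma * h            # bijective in h, since sigma > 0

    def inverse(self, s, a):
        mu = self.W_mu @ s
        sigma = np.exp(self.W_logsig @ s)
        return (a - mu) / sigma          # exact inverse recovers the latent

state_dim, act_dim = 4, 2
low = AffineLatentPolicy(state_dim, act_dim)    # trained first, on the raw task
high = AffineLatentPolicy(state_dim, act_dim)   # stacked on top afterwards

s = rng.normal(size=state_dim)
z = rng.normal(size=act_dim)    # top-level latent, sampled from the N(0, I) prior
h = high.forward(s, z)          # higher layer steers the lower layer via its latent
a = low.forward(s, h)           # lower layer maps that latent to an action

# Invertibility means the higher layer can still induce any action the
# lower layer could take on its own:
assert np.allclose(low.inverse(s, a), h)
```

The final assertion is the expressivity claim in miniature: because each layer's latent-to-action map is a bijection, stacking layers restricts neither the lower layer nor the higher one.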