Replay across Experiments: A Natural Extension of Off-Policy RL
DOI:
10.48550/arXiv.2311.15951
Publication Date:
2023-01-01
AUTHORS (12)
ABSTRACT
Replaying data is a principal mechanism underlying the stability and efficiency of off-policy reinforcement learning (RL). We present an effective yet simple framework to extend the use of replays across multiple experiments, minimally adapting the RL workflow for sizeable improvements in controller performance and research iteration times. At its core, Replay across Experiments (RaE) involves reusing experience from previous experiments to improve exploration and bootstrap learning, while reducing the required changes to a minimum in comparison to prior work. We empirically show benefits across a number of algorithms and challenging control domains spanning both locomotion and manipulation, including hard tasks from egocentric vision. Through comprehensive ablations, we demonstrate robustness to the quality and amount of available data and to various hyperparameter choices. Finally, we discuss how our approach can be applied more broadly across research life cycles and can increase resilience by reloading data across random seeds or variations.
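As a rough illustration of the core idea described in the abstract (not the authors' implementation), the sketch below mixes transitions loaded from earlier experiments into the replay buffer of a new run, so that early learning can bootstrap from old data while freshly collected experience is added alongside it. The class name, the helper structure, and the 50/50 mixing ratio are illustrative assumptions.

```python
import random
from collections import deque

class MixedReplayBuffer:
    """Minimal sketch of replaying data across experiments.

    A fraction of every sampled batch is drawn from transitions saved by
    previous experiments; the remainder comes from the current run. The
    prior_fraction value and helper names are hypothetical.
    """

    def __init__(self, prior_transitions, capacity=1_000_000, prior_fraction=0.5):
        self.prior = list(prior_transitions)   # frozen data from earlier runs
        self.fresh = deque(maxlen=capacity)    # data collected in this run
        self.prior_fraction = prior_fraction

    def add(self, transition):
        """Store a transition (s, a, r, s', done) from the current policy."""
        self.fresh.append(transition)

    def sample(self, batch_size):
        """Draw a batch mixing prior-experiment and fresh transitions."""
        n_prior = int(batch_size * self.prior_fraction) if self.prior else 0
        n_fresh = batch_size - n_prior
        batch = random.choices(self.prior, k=n_prior) if n_prior else []
        if self.fresh:
            batch += random.choices(list(self.fresh), k=n_fresh)
        return batch
```

An off-policy learner would call sample() for its gradient updates exactly as with a standard replay buffer; only the source of the data changes, which is why such reuse slots into existing workflows with few modifications.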