Effective Diversity in Population Based Reinforcement Learning
FOS: Computer and information sciences
Computer Science - Machine Learning
Statistics - Machine Learning
0202 electrical engineering, electronic engineering, information engineering
Machine Learning (stat.ML)
02 engineering and technology
01 natural sciences
0105 earth and related environmental sciences
Machine Learning (cs.LG)
DOI:
10.48550/arxiv.2002.00632
Publication Date:
2020-01-01
AUTHORS (4)
ABSTRACT
Exploration is a key problem in reinforcement learning, since agents can only learn from data they acquire in the environment. With that in mind, maintaining a population of agents is an attractive method, as it allows data to be collected with a diverse set of behaviors. This behavioral diversity is often boosted via multi-objective loss functions. However, those approaches typically leverage mean field updates based on pairwise distances, which makes them susceptible to cycling behaviors and increased redundancy. In addition, explicitly boosting diversity often has a detrimental impact on optimizing already fruitful behaviors for rewards. As such, the reward-diversity trade off typically relies on heuristics. Finally, such methods require behavioral representations, often handcrafted and domain specific. In this paper, we introduce an approach to optimize all members of the population simultaneously. Rather than using pairwise distance, we measure the volume of the entire population in a behavioral manifold, defined by task-agnostic behavioral embeddings. Our algorithm, Diversity via Determinants (DvD), adapts the degree of diversity during training using online learning techniques. We introduce both evolutionary and gradient-based instantiations of DvD and show they effectively improve exploration without reducing performance when better exploration is not required.
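The abstract's core idea, scoring a whole population by the volume it spans in a behavioral embedding space via a determinant, can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the length-scale `l`, and the toy embeddings below are illustrative assumptions; DvD's actual embeddings and kernel choice are defined in the paper.

```python
import numpy as np

def population_diversity(embeddings, l=1.0):
    """Determinant-based diversity score for a population.

    embeddings: array of shape (pop_size, embed_dim), one row per agent's
    behavioral embedding. Builds an RBF kernel matrix K over the rows and
    returns det(K): near 0 when behaviors coincide, larger when they spread.
    (RBF kernel and length-scale are illustrative choices, not the paper's.)
    """
    sq = np.sum(embeddings ** 2, axis=1)
    # Pairwise squared distances between embeddings.
    d2 = sq[:, None] + sq[None, :] - 2.0 * embeddings @ embeddings.T
    K = np.exp(-d2 / (2.0 * l ** 2))
    return np.linalg.det(K)

# Identical behaviors collapse the kernel matrix to rank one: det ~ 0.
same = np.zeros((3, 4))
# Well-separated behaviors give a near-identity kernel: det near 1.
spread = 3.0 * np.eye(3, 4)
print(population_diversity(same))
print(population_diversity(spread))
```

In a multi-objective loss, this determinant (or its log) would be added to the population's reward term, with the trade-off weight adapted online as the abstract describes.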