VIME: Variational Information Maximizing Exploration

FOS: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO); Machine Learning (stat.ML)
DOI: 10.48550/arxiv.1605.09674
Publication Date: 2016
ABSTRACT
Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks, which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration strategies across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.
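
For concreteness, the Python (PyTorch) sketch below illustrates the kind of exploration bonus the abstract describes: maintain a factorized Gaussian variational posterior over the weights of a learned dynamics model, and reward each transition by the KL divergence between the posterior after and before updating on that transition, i.e. the approximate information gain. This is a minimal sketch under stated assumptions, not the authors' released implementation: the names BayesianLinear, BayesianDynamicsModel, vime_bonus, and eta are illustrative, and the MSE loss stands in for the variational lower bound optimized in the paper.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class BayesianLinear(nn.Module):
    # Linear layer with a fully factorized Gaussian posterior over its
    # weights; biases are kept as point estimates to keep the sketch short.
    def __init__(self, n_in, n_out):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_out, n_in))
        self.rho = nn.Parameter(torch.full((n_out, n_in), -3.0))  # sigma = softplus(rho)
        self.bias = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):
        sigma = F.softplus(self.rho)
        weight = self.mu + sigma * torch.randn_like(self.mu)  # reparameterization trick
        return F.linear(x, weight, self.bias)

    def kl_to(self, old):
        # KL( N(mu, sigma^2) || N(mu_old, sigma_old^2) ), summed over all weights.
        s, s_old = F.softplus(self.rho), F.softplus(old.rho)
        return (torch.log(s_old / s)
                + (s ** 2 + (self.mu - old.mu) ** 2) / (2 * s_old ** 2)
                - 0.5).sum()

class BayesianDynamicsModel(nn.Module):
    # Predicts the next state from (state, action) with Bayesian layers.
    def __init__(self, obs_dim, act_dim, hidden=32):
        super().__init__()
        self.layers = nn.ModuleList([
            BayesianLinear(obs_dim + act_dim, hidden),
            BayesianLinear(hidden, obs_dim),
        ])

    def forward(self, s, a):
        h = torch.tanh(self.layers[0](torch.cat([s, a], dim=-1)))
        return self.layers[1](h)

def vime_bonus(model, optimizer, s, a, s_next, n_steps=1):
    # Information gain of one transition, approximated as the KL divergence
    # between the variational posterior after and before fitting it.
    old = copy.deepcopy(model)  # snapshot of the current posterior parameters
    for _ in range(n_steps):
        optimizer.zero_grad()
        # MSE stands in here for the variational lower bound used in the paper.
        loss = F.mse_loss(model(s, a), s_next)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return sum(l.kl_to(o) for l, o in zip(model.layers, old.layers)).item()

# Example usage: augment the environment reward as r' = r + eta * information
# gain, where eta trades off exploration against exploitation (all values
# below are placeholders, not from the paper).
obs_dim, act_dim, eta = 4, 1, 1e-3
model = BayesianDynamicsModel(obs_dim, act_dim)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
s, a, s_next = torch.randn(1, obs_dim), torch.randn(1, act_dim), torch.randn(1, obs_dim)
r_augmented = 1.0 + eta * vime_bonus(model, optimizer, s, a, s_next)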