Policy Distillation

FOS: Computer and information sciences; Computer Science - Machine Learning; 0209 industrial biotechnology; 02 engineering and technology; Machine Learning (cs.LG)

DOI: 10.48550/arxiv.1511.06295

Publication Date: 2015-01-01

AUTHORS (9)

Rusu, Andrei A.

Colmenarejo, Serg...

Gulcehre, Caglar

Desjardins, Guill...

Kirkpatrick, James

Pascanu, Razvan

Mnih, Volodymyr

Kavukcuoglu, Koray

Hadsell, Raia

ABSTRACT

Policies for complex visual tasks have been successfully learned with deep reinforcement learning, using an approach called deep Q-networks (DQN), but relatively large (task-specific) networks and extensive training are needed to achieve good performance. In this work, we present a novel method called policy distillation that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient. Furthermore, the same method can be used to consolidate multiple task-specific policies into a single policy. We demonstrate these claims using the Atari domain and show that the multi-task distilled agent outperforms the single-task teachers as well as a jointly-trained DQN agent.

Submitted to ICLR 2016
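
The abstract does not state the training objective used to transfer a policy from teacher to student. As a minimal sketch, assuming the student is trained to match a temperature-sharpened softmax over the teacher's Q-values under a KL divergence loss (one common way to set up distillation), the per-state loss could look like the code below. The function names, the default temperature of 0.01, and the example Q-values are illustrative assumptions and are not taken from this page.

```python
import math


def log_softmax(values, temperature=1.0):
    """Numerically stable log-softmax of values / temperature."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    # The largest term contributes exp(0) = 1, so the sum is always >= 1.
    log_total = math.log(sum(math.exp(s - peak) for s in scaled))
    return [s - peak - log_total for s in scaled]


def distillation_loss(teacher_q, student_logits, temperature=0.01):
    """KL(teacher || student) over the actions of a single state.

    Assumption: a small temperature sharpens the teacher's Q-values into a
    near one-hot target, so the student mostly learns the greedy action.
    """
    log_p = log_softmax(teacher_q, temperature)  # sharpened teacher target
    log_q = log_softmax(student_logits)          # student action distribution
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical Q-values for one state with three actions; action 1 is greedy.
teacher_q = [1.0, 2.0, 0.5]
agreeing_student = [0.0, 5.0, 0.0]     # prefers the teacher's greedy action
disagreeing_student = [5.0, 0.0, 0.0]  # prefers a different action

loss_agree = distillation_loss(teacher_q, agreeing_student)
loss_disagree = distillation_loss(teacher_q, disagreeing_student)
print(loss_agree < loss_disagree)
```

Under these assumptions, a student that ranks the teacher's greedy action highest incurs a smaller loss than one that prefers another action. In a multi-task setting, the same loss could be averaged over states drawn from several teachers to train a single student.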
