NFDI4DS | UHH-SEMS - Publication Details

An Actor-Critic Algorithm for Sequence Prediction

Leverage (statistics); Ground truth; Sequence (biology); Deep Neural Networks

DOI: 10.48550/arxiv.1607.07086 Publication Date: 2016-01-01


AUTHORS (8)

Dzmitry Bahdanau

Philémon Brakel

Kelvin Xu

Anirudh Goyal

Ryan Lowe

Joëlle Pineau

Aaron Courville

Yoshua Bengio

ABSTRACT

We present an approach to training neural networks to generate sequences using actor-critic methods from reinforcement learning (RL). Current log-likelihood training methods are limited by the discrepancy between their training and testing modes, as models must generate tokens conditioned on their previous guesses rather than the ground-truth tokens. We address this problem by introducing a critic network that is trained to predict the value of an output token, given the policy of an actor network. This results in a training procedure that is much closer to the test phase, and allows us to directly optimize for a task-specific score such as BLEU. Crucially, since we leverage these techniques in the supervised learning setting rather than the traditional RL setting, we condition the critic network on the ground-truth output. We show that our method leads to improved performance on both a synthetic task and German-English machine translation. Our analysis paves the way for such methods to be applied in natural language generation tasks, such as machine translation, caption generation, and dialogue modelling.
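The interplay between actor and critic described in the abstract can be illustrated with a minimal sketch for a single decoding step. This is not the authors' implementation: the function names, the three-token vocabulary, and all numbers are hypothetical, and the critic's value estimates are supplied by hand rather than predicted by a network conditioned on the ground-truth output. The sketch assumes a softmax actor, for which the gradient of the expected critic value with respect to logit k is p_k * (Q_k - V), and it assumes a one-step critic target of the form reward plus expected next-step value.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_value(probs, q_values):
    """Expected critic value under the actor's token distribution."""
    return sum(p * q for p, q in zip(probs, q_values))

def actor_logit_gradient(logits, q_values):
    """Gradient of the expected value with respect to the actor's logits.

    For a softmax policy: d/dz_k sum_a p(a) Q(a) = p_k * (Q_k - V).
    """
    probs = softmax(logits)
    v = expected_value(probs, q_values)
    return [p * (q - v) for p, q in zip(probs, q_values)]

def critic_target(reward, next_probs, next_q_values):
    """Assumed one-step target: immediate reward plus expected next value."""
    return reward + expected_value(next_probs, next_q_values)

# Hypothetical values for a 3-token vocabulary at one decoding step.
logits = [0.0, 0.0, 0.0]        # actor is initially indifferent
q_values = [1.0, 0.0, -1.0]     # made-up critic estimates per candidate token
grad = actor_logit_gradient(logits, q_values)

# Hypothetical next-step quantities for the critic's regression target.
target = critic_target(0.5, [0.5, 0.5], [1.0, 3.0])
```

Following the gradient raises the logit of the token the critic scores highest and lowers the logit of the lowest-scored one, so the actor is pushed toward tokens with higher predicted task score (for example, BLEU) instead of only toward the ground-truth token.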


