
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

FOS: Computer and information sciences; Computer Science - Computation and Language; 0202 electrical engineering, electronic engineering, information engineering; 02 engineering and technology; Computation and Language (cs.CL)

DOI: 10.48550/arxiv.2001.04063

Publication Date: 2020-01-01


AUTHORS (8)

Weizhen Qi

Yeyun Gong

Ming Zhou

Ruofei Zhang

Dayiheng Liu

Nan Duan

Yu Yan

Jiusheng Chen

ABSTRACT

Accepted to EMNLP 2020 Findings. Project page: https://github.com/microsoft/ProphetNet

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction together with an n-stream self-attention mechanism. Instead of optimizing one-step-ahead prediction as in traditional sequence-to-sequence models, ProphetNet is optimized by n-step-ahead prediction: at each time step it predicts the next n tokens simultaneously, based on the previous context tokens. Future n-gram prediction explicitly encourages the model to plan for future tokens and prevents overfitting on strong local correlations. We pre-train ProphetNet on a base-scale dataset (16GB) and a large-scale dataset (160GB), respectively. We then conduct experiments on the CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation. Experimental results show that ProphetNet achieves new state-of-the-art results on all these datasets compared to models using the same scale of pre-training corpus.
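To make the future n-gram objective concrete, the following is a minimal sketch of how the prediction targets could be laid out: for each position t, the targets are the next n tokens rather than only the next one. The function name `future_ngram_targets` and the use of `None` as padding past the end of the sequence are assumptions for illustration; this is not the authors' implementation and it does not model the n-stream self-attention mechanism.

```python
from typing import List, Optional


def future_ngram_targets(tokens: List[str], n: int) -> List[List[Optional[str]]]:
    """Return, for each position t, the tokens at t+1 .. t+n.

    Positions that would run past the end of the sequence are filled
    with None (a hypothetical padding choice for this sketch).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    targets: List[List[Optional[str]]] = []
    for t in range(len(tokens)):
        row: List[Optional[str]] = []
        for k in range(1, n + 1):
            idx = t + k
            # k = 1 is the usual one-step-ahead target; k > 1 are the
            # additional future tokens predicted at the same time step.
            row.append(tokens[idx] if idx < len(tokens) else None)
        targets.append(row)
    return targets


tokens = ["the", "cat", "sat", "down"]
targets = future_ngram_targets(tokens, n=2)
for token, row in zip(tokens, targets):
    print(token, "->", row)
```

With n = 1 this reduces to the standard one-step-ahead targets of a traditional sequence-to-sequence model; larger n adds one extra target per time step for each further future token.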

