ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization

FOS: Computer and Information Sciences; Computation and Language (cs.CL)
DOI: 10.48550/arxiv.2109.04098 Publication Date: 2021-01-01
ABSTRACT
Abstractive text summarization is one of the areas influenced by the emergence of pre-trained language models. Current pre-training approaches for abstractive summarization favor summaries that share more words with the main text and pay less attention to the semantic similarity between the generated sentences and the original document. We propose ARMAN, a Transformer-based encoder-decoder model with three novel objectives that address this issue. In ARMAN, salient sentences from a document are selected according to a modified semantic score and masked to form a pseudo summary. To produce summaries that more closely follow human writing patterns, we also applied sentence reordering. We evaluated the proposed models on six downstream Persian summarization tasks. Experimental results show that ARMAN achieves state-of-the-art performance on all tasks as measured by ROUGE and BERTScore. Our models also outperform prior work on textual entailment, question paraphrasing, and multiple-choice question answering. Finally, we established a human evaluation and show that using the semantic score significantly improves summarization results.
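The pre-training objective described above (select salient sentences by a semantic score, mask them, and use them as a pseudo summary) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the bag-of-words cosine used here is a hypothetical stand-in for ARMAN's modified semantic score, and the sentence splitter, `k`, and `mask_token` are illustrative choices.

```python
import math
import re
from collections import Counter


def sentence_vector(sentence):
    """Bag-of-words count vector (a toy stand-in for a semantic embedding)."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(c1, c2):
    """Cosine similarity between two count vectors."""
    dot = sum(c1[w] * c2[w] for w in set(c1) & set(c2))
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def build_pseudo_summary(document, k=1, mask_token="[MASK]"):
    """Score each sentence against the rest of the document, pick the
    top-k as salient, mask them in the input, and return the pair
    (masked_document, pseudo_summary) used as a pre-training example."""
    sentences = [s.strip()
                 for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    vectors = [sentence_vector(s) for s in sentences]

    scores = []
    for i, v in enumerate(vectors):
        rest = Counter()  # all other sentences merged into one vector
        for j, u in enumerate(vectors):
            if j != i:
                rest.update(u)
        scores.append(cosine(v, rest))

    salient = sorted(range(len(sentences)),
                     key=lambda i: scores[i], reverse=True)[:k]
    masked = [mask_token if i in salient else s
              for i, s in enumerate(sentences)]
    summary = [sentences[i] for i in sorted(salient)]
    return " ".join(masked), " ".join(summary)
```

During pre-training, the encoder would see the masked document and the decoder would be trained to generate the pseudo summary; the sentence-reordering objective would additionally permute the remaining sentences, which is omitted here for brevity.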