
VLSP 2022 Abmusu Task Dataset: A Resource for Vietnamese Abstractive Multi-Document Summarization

Keywords: Vietnamese; Benchmark (surveying); Multi-document summarization

DOI: 10.1142/s2717554523500030
Publication Date: 2023-05-17T07:18:47Z


AUTHORS (4)

Quoc-An Nguyen

Duy-Cat Can

Hoang-Quynh Le

Mai-Vu Tran

ABSTRACT

The performance of automatic summarization systems has improved significantly with the development of supervised approaches. However, for the Vietnamese abstractive multi-document summarization task, the available datasets are insufficient for training a model. With this motivation, we contribute a new gold-standard dataset, named Abmusu. After collecting and clustering articles, we built a hierarchical annotation process to generate summaries, with three roles: annotator, supervisor, and curator. As a result, the dataset contains 600 news clusters formed from 1839 articles, along with their corresponding human-generated summaries. To the best of our knowledge, Abmusu is the biggest dataset of its kind that is freely available for research. Moreover, its summaries are more concise, which makes it challenging to train models on. We also used various baselines to benchmark the dataset.


