MarkupLM: Pre-training of Text and Markup Language for Visually Rich Document Understanding

DOI: 10.18653/v1/2022.acl-long.420 Publication Date: 2022-06-03T01:34:53Z
ABSTRACT
Multimodal pre-training with text, layout, and image has made significant progress for Visually Rich Document Understanding (VRDU), especially for fixed-layout documents such as scanned document images. However, there are still a large number of digital documents where the layout information is not fixed and needs to be interactively and dynamically rendered for visualization, making existing layout-based pre-training approaches hard to apply. In this paper, we propose MarkupLM for document understanding tasks that use markup languages as the backbone, such as HTML/XML-based documents, where text and markup information are jointly pre-trained. Experiment results show that the pre-trained MarkupLM significantly outperforms strong baseline models on several document understanding tasks. The pre-trained model and code will be publicly available at https://aka.ms/markuplm.
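As a concrete illustration of feeding an HTML document jointly as text and markup, the sketch below uses the Hugging Face transformers port of MarkupLM (MarkupLMProcessor and MarkupLMModel with the microsoft/markuplm-base checkpoint); this is a minimal usage example of that port under the assumption it is installed, not a description of the paper's own training code.

```python
# Minimal sketch: encode an HTML page with MarkupLM via the transformers port.
# The processor extracts text nodes and their XPaths from raw HTML, so the
# model receives both textual and markup (DOM structure) signals.
from transformers import MarkupLMProcessor, MarkupLMModel

processor = MarkupLMProcessor.from_pretrained("microsoft/markuplm-base")
model = MarkupLMModel.from_pretrained("microsoft/markuplm-base")

html_string = """
<html>
  <head><title>Example Page</title></head>
  <body><h1>Welcome</h1><p>MarkupLM reads text together with its markup.</p></body>
</html>
"""

# Produces input_ids plus xpath_tags_seq / xpath_subs_seq encoding the DOM path
# of each text token, which MarkupLM embeds alongside the text.
encoding = processor(html_string, return_tensors="pt")
outputs = model(**encoding)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```

The same processor output can be passed to task heads such as MarkupLMForQuestionAnswering or MarkupLMForTokenClassification for the downstream document understanding tasks evaluated in the paper.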