Towards Robust Visual Information Extraction in Real World: New Dataset and Novel Solution

DOI: 10.48550/arxiv.2102.06732
Publication Date: 2021-01-01
ABSTRACT
Visual information extraction (VIE) has attracted considerable attention recently owing to its various advanced applications such as document understanding, automatic marking and intelligent education. Most existing works decoupled this problem into several independent sub-tasks of text spotting (text detection and recognition) and information extraction, which completely ignored the high correlation among them during optimization. In this paper, we propose a robust visual information extraction system (VIES) towards real-world scenarios, which is a unified end-to-end trainable framework for simultaneous text detection, recognition and information extraction by taking a single document image as input and outputting the structured information. Specifically, the information extraction branch collects abundant visual and semantic representations from text spotting for multimodal feature fusion and, conversely, provides higher-level semantic clues to contribute to the optimization of text spotting. Moreover, regarding the shortage of public benchmarks, we construct a fully-annotated dataset called EPHOIE (https://github.com/HCIILAB/EPHOIE), which is the first Chinese benchmark for both text spotting and visual information extraction. EPHOIE consists of 1,494 images of examination paper heads with complex layouts and backgrounds, including a total of 15,771 handwritten or printed text instances. Compared with state-of-the-art methods, our VIES shows significantly superior performance on EPHOIE and achieves a 9.01% F-score gain on the widely used SROIE dataset under the end-to-end scenario.
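The abstract describes the information extraction branch fusing visual representations from the spotting branch with semantic representations of the recognized text. A minimal illustrative sketch of such multimodal fusion, assuming per-instance feature vectors and a simple concatenation baseline (the paper's actual fusion module is more elaborate and is not reproduced here):

```python
import numpy as np

def fuse_features(visual: np.ndarray, semantic: np.ndarray) -> np.ndarray:
    """Concatenation-based multimodal fusion (hypothetical baseline).

    `visual` and `semantic` each hold one feature vector per detected
    text instance; the fused representation simply stacks them along
    the feature dimension before any downstream extraction head.
    """
    assert visual.shape[0] == semantic.shape[0], "one row per text instance"
    return np.concatenate([visual, semantic], axis=1)

# Toy example: 4 detected text instances, 256-d visual and 128-d
# semantic features (dimensions chosen for illustration only).
vis = np.zeros((4, 256))
sem = np.zeros((4, 128))
fused = fuse_features(vis, sem)
print(fused.shape)  # (4, 384)
```

Concatenation is only one of several common fusion strategies; attention-based or gated fusion modules are frequent alternatives in end-to-end VIE systems.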