Cross-media Structured Common Space for Multimedia Event Extraction

DOI: 10.18653/v1/2020.acl-main.230
Publication Date: 2020-07-29T14:14:43Z
ABSTRACT
We introduce a new task, MultiMedia Event Extraction, which aims to extract events and their arguments from multimedia documents. We develop the first benchmark and collect a dataset of 245 multimedia news articles with extensively annotated events and arguments. We propose a novel method, Weakly Aligned Structured Embedding (WASE), that encodes structured representations of semantic information from textual and visual data into a common embedding space. The structures are aligned across modalities by employing a weakly supervised training strategy, which enables exploiting available resources without explicit cross-media annotation. Compared to uni-modal state-of-the-art methods, our approach achieves 4.0% and 9.8% absolute F-score gains on text event argument role labeling and visual event extraction. Compared to state-of-the-art multimedia unstructured representations, we achieve 8.3% and 5.0% absolute F-score gains on multimedia event extraction and argument role labeling, respectively. By utilizing images, we extract 21.4% more event mentions than traditional text-only methods.
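To make the common-space idea concrete, the sketch below shows one standard way weakly supervised cross-modal alignment is implemented: text and image features are projected into a shared space, and a triplet ranking loss pushes weakly aligned (text, image) pairs, such as an image and the article it appears in, closer together than mismatched pairs. This is a minimal PyTorch illustration, not the paper's WASE implementation; the projector dimensions, the margin value, and the CommonSpaceProjector / weak_alignment_loss names are assumptions introduced here, and WASE additionally encodes graph structures on each side, which this sketch omits.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CommonSpaceProjector(nn.Module):
    """Projects modality-specific features into a shared embedding space.

    Illustrative sketch only: the encoder dimensions below are assumed
    (e.g., BERT-sized text features, ResNet-sized image features), not
    the paper's exact architecture.
    """

    def __init__(self, text_dim: int = 768, image_dim: int = 2048, common_dim: int = 300):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, common_dim)
        self.image_proj = nn.Linear(image_dim, common_dim)

    def forward(self, text_feats: torch.Tensor, image_feats: torch.Tensor):
        # L2-normalize so cosine similarity reduces to a dot product.
        t = F.normalize(self.text_proj(text_feats), dim=-1)
        v = F.normalize(self.image_proj(image_feats), dim=-1)
        return t, v


def weak_alignment_loss(t: torch.Tensor, v: torch.Tensor, margin: float = 0.2) -> torch.Tensor:
    """Triplet-style ranking loss over weakly aligned (text, image) pairs.

    Row i of `t` and row i of `v` come from the same document (weak
    alignment); all other rows in the batch serve as negatives.
    """
    sim = t @ v.T                        # (B, B) pairwise cosine similarities
    pos = sim.diag().unsqueeze(1)        # similarity of each aligned pair
    # Hinge loss pushing aligned pairs above mismatched ones by `margin`.
    cost_t = (margin + sim - pos).clamp(min=0)    # text anchors vs. image negatives
    cost_v = (margin + sim - pos.T).clamp(min=0)  # image anchors vs. text negatives
    mask = torch.eye(sim.size(0), dtype=torch.bool)
    cost_t = cost_t.masked_fill(mask, 0.0)        # drop the positive-pair cells
    cost_v = cost_v.masked_fill(mask, 0.0)
    return cost_t.mean() + cost_v.mean()


if __name__ == "__main__":
    model = CommonSpaceProjector()
    text_feats = torch.randn(8, 768)    # stand-in for sentence encodings
    image_feats = torch.randn(8, 2048)  # stand-in for image encodings
    t, v = model(text_feats, image_feats)
    loss = weak_alignment_loss(t, v)
    loss.backward()
    print(f"alignment loss: {loss.item():.4f}")

Using in-batch negatives this way is what makes the supervision weak: no event-level cross-media annotation is needed, only the co-occurrence of an image with its surrounding text.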