NFDI4DS | UHH-SEMS - Publication Details

Interpretable and Globally Optimal Prediction for Textual Grounding using Image Concepts

Interpretability Bounding overwatch Minimum bounding box

DOI: 10.48550/arxiv.1803.11209 Publication Date: 2018-01-01


AUTHORS (5)

Raymond A. Yeh

Jinjun Xiong

Wen‐mei Hwu

Minh N. Do

Alexander G. Schwing

ABSTRACT

Textual grounding is an important but challenging task for human-computer interaction, robotics and knowledge mining. Existing algorithms generally formulate the task as selection from a set of bounding box proposals obtained from deep net based systems. In this work, we demonstrate that we can cast the problem of textual grounding into a unified framework that permits efficient search over all possible bounding boxes. Hence, the method is able to consider significantly more proposals and doesn't rely on a successful first stage hypothesizing bounding box proposals. Beyond this, we demonstrate that the trained parameters of our model can be used as word-embeddings which capture spatial-image relationships and provide interpretability. Lastly, at the time of submission, our approach outperformed the current state-of-the-art methods on the Flickr 30k Entities and the ReferItGame dataset by 3.08% and 7.77% respectively.
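To make the idea of searching over all possible bounding boxes concrete, the following minimal sketch scores every axis-aligned box on a small per-pixel score map and returns the best one. It is an illustration under stated assumptions, not the authors' method: the abstract does not describe the search algorithm, so this uses plain exhaustive enumeration with a summed-area table, and the names `integral_image`, `box_sum`, `best_box` and the notion of a single additive score map are hypothetical.

```python
def integral_image(score_map):
    """Build a (H+1) x (W+1) summed-area table for a 2D list of scores."""
    h, w = len(score_map), len(score_map[0])
    table = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += score_map[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table


def box_sum(table, top, left, bottom, right):
    """Sum of scores in the box with inclusive corners (top, left) and (bottom, right)."""
    return (table[bottom + 1][right + 1] - table[top][right + 1]
            - table[bottom + 1][left] + table[top][left])


def best_box(score_map):
    """Score every axis-aligned box (assumed additive score) and return (score, box)."""
    h, w = len(score_map), len(score_map[0])
    table = integral_image(score_map)
    best_score, best = float("-inf"), None
    for top in range(h):
        for bottom in range(top, h):
            for left in range(w):
                for right in range(left, w):
                    s = box_sum(table, top, left, bottom, right)
                    if s > best_score:
                        best_score, best = s, (top, left, bottom, right)
    return best_score, best
```

The summed-area table makes each box score a constant-time lookup, but the enumeration itself still grows with the fourth power of the image side length, so a sketch like this is only practical on coarse score maps. A globally optimal search over full-resolution images would need a smarter strategy than this brute-force loop.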
