Localize, Assemble, and Predicate: Contextual Object Proposal Embedding for Visual Relation Detection
DOI: 10.1609/aaai.v34i07.6913
Publication Date: 2020-06-29
ABSTRACT
Visual relation detection (VRD) aims to describe all interacting objects in an image using subject-predicate-object triplets. Critically, the number of valid relations grows combinatorially as O(C²R) for C object categories and R relationship types; for example, 100 categories and 70 predicates already yield 700,000 possible triplets. The frequencies of these triplets exhibit a long-tailed distribution, which inevitably biases the learned VRD model towards popular visual relations. To address this problem, we propose the localize-assemble-predicate network (LAP-Net), which decomposes VRD into three sub-tasks: localizing individual objects, assembling subject-object pairs, and predicting the predicate for each pair. In the first stage of LAP-Net, a Region Proposal Network (RPN) generates a few class-agnostic object proposals. Next, these proposals are assembled into subject-object pairs by a second Pair Proposal Network (PPN) through a novel contextual embedding scheme: the inner product between the embedded representations of two proposals faithfully reflects their compatibility as a pair, without estimating the subject or object class. Top-ranked pairs from the first two stages are fed into a third sub-network, which precisely estimates the relationship. The whole pipeline, except for the last stage, is object-category-agnostic when localizing relationships in an image, alleviating the bias induced by the long-tailed training data. LAP-Net can be trained in an end-to-end fashion. We demonstrate that it achieves state-of-the-art performance on benchmarks while maintaining a high inference speed.
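The abstract does not specify the PPN's architecture, but the pair-scoring idea it describes — embedding each class-agnostic proposal and ranking subject-object pairs by the inner product of their embeddings — can be sketched in PyTorch. This is a minimal sketch under assumed details: the class name PairProposalScorer, the two embedding heads (subj_head, obj_head), and all dimensions are hypothetical illustrations, not the paper's actual design.

```python
# Hypothetical sketch of inner-product pair scoring over class-agnostic
# proposals. Names and layer sizes are assumptions, not the paper's.
import torch
import torch.nn as nn


class PairProposalScorer(nn.Module):
    def __init__(self, feat_dim: int = 1024, embed_dim: int = 256):
        super().__init__()
        # Separate heads let a proposal embed differently in the
        # subject role than in the object role.
        self.subj_head = nn.Sequential(
            nn.Linear(feat_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )
        self.obj_head = nn.Sequential(
            nn.Linear(feat_dim, embed_dim), nn.ReLU(),
            nn.Linear(embed_dim, embed_dim),
        )

    def forward(self, proposal_feats: torch.Tensor) -> torch.Tensor:
        # proposal_feats: (N, feat_dim) pooled features for N proposals.
        s = self.subj_head(proposal_feats)  # (N, embed_dim)
        o = self.obj_head(proposal_feats)   # (N, embed_dim)
        # Entry (i, j) scores proposal i as subject, proposal j as object.
        return s @ o.t()


# Usage: score 50 proposals, then keep the top 100 pairs for a
# downstream predicate-classification stage.
scorer = PairProposalScorer()
feats = torch.randn(50, 1024)  # stand-in for RoI-pooled features
with torch.no_grad():
    scores = scorer(feats)
    scores.fill_diagonal_(float("-inf"))  # a proposal cannot pair with itself
    top = scores.flatten().topk(100)
    subj_idx = torch.div(top.indices, 50, rounding_mode="floor")
    obj_idx = top.indices % 50
```

Because the two heads are distinct, the score matrix is asymmetric: scoring proposal i as subject with proposal j as object differs from the reverse, which matches the directed nature of subject-predicate-object triplets. Note the scorer never looks at class labels, consistent with the abstract's claim that pair assembly is object-category-agnostic.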