NFDI4DS | UHH-SEMS - Publication Details

S^2Former-OR: Single-Stage Bimodal Transformer for Scene Graph Generation in OR

Keywords: Transformer; 3D surgical scene understanding; Scene graph generation; Single-stage; Bi-modal

Subjects: FOS: Computer and information sciences; Computer Vision and Pattern Recognition (cs.CV)

DOI: 10.48550/arxiv.2402.14461

Publication Date: 2024-02-22

AUTHORS (6)

Jialun Pei

Diandian Guo

Jingyang Zhang

Manxi Lin

Yueming Jin

Pheng‐Ann Heng

ABSTRACT

Scene graph generation (SGG) of surgical procedures is crucial in enhancing holistically cognitive intelligence in the operating room (OR). However, previous works have primarily relied on multi-stage learning that generates semantic scene graphs dependent on intermediate processes with pose estimation and object detection, which may compromise model efficiency and efficacy and also impose an extra annotation burden. In this study, we introduce a novel single-stage bimodal transformer framework for SGG in the OR, termed S^2Former-OR, aimed to complementally leverage multi-view 2D scenes and 3D point clouds for SGG in an end-to-end manner. Concretely, our model embraces a View-Sync Transfusion scheme to encourage multi-view visual information interaction. Concurrently, a Geometry-Visual Cohesion operation is designed to integrate the synergic 2D semantic features into 3D point cloud features. Moreover, based on the augmented feature, we propose a relation-sensitive transformer decoder that embeds dynamic entity-pair queries and relational trait priors, which enables the direct prediction of entity-pair relations without intermediate steps. Extensive experiments have validated the superior SGG performance and lower computational cost of S^2Former-OR on the 4D-OR benchmark compared with current OR-SGG methods, e.g., a 3% Precision increase and a 24.2M reduction in model parameters. We further compared our method with generic single-stage SGG methods on broader metrics for a comprehensive evaluation, with consistently better performance achieved. The code will be made available.
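
To make the abstract's terms concrete, the following is a minimal sketch of what an SGG output and a precision score can look like. It is not the authors' code or the 4D-OR evaluation protocol: a scene graph is assumed to be a set of (subject, predicate, object) triplets, and precision is simplified to the fraction of predicted triplets that appear in the ground truth. The `Triplet` type, the `triplet_precision` function, and all entity and predicate labels are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple, Set


class Triplet(NamedTuple):
    """One scene graph edge: subject --predicate--> object."""
    subject: str
    predicate: str
    obj: str


def triplet_precision(predicted: Set[Triplet], ground_truth: Set[Triplet]) -> float:
    """Fraction of predicted triplets that also occur in the ground truth.

    Simplified, triplet-level precision for illustration only; benchmark
    protocols may instead average per relation class.
    """
    if not predicted:
        return 0.0
    return len(predicted & ground_truth) / len(predicted)


# Hypothetical labels, not taken from the paper or the benchmark.
ground_truth = {
    Triplet("patient", "lying_on", "operating_table"),
    Triplet("head_surgeon", "cutting", "patient"),
    Triplet("assistant_surgeon", "assisting", "head_surgeon"),
}
predicted = {
    Triplet("patient", "lying_on", "operating_table"),
    Triplet("head_surgeon", "cutting", "patient"),
    Triplet("nurse", "touching", "patient"),
    Triplet("assistant_surgeon", "close_to", "head_surgeon"),
}

precision = triplet_precision(predicted, ground_truth)
print(f"precision = {precision:.2f}")  # prints "precision = 0.50"
```

In this toy case two of the four predicted triplets match the ground truth, so the score is 0.50. A single-stage model, as described in the abstract, would emit such triplets directly rather than first running separate pose estimation and object detection stages.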