MUSTAN: Multi-scale Temporal Context as Attention for Robust Video Foreground Segmentation
DOI: 10.48550/arxiv.2402.00918
Publication Date: 2024-02-01
AUTHORS (4)
ABSTRACT
Video foreground segmentation (VFS) is an important computer vision task wherein one aims to segment the objects under motion from the background. Most current methods are image-based, i.e., they rely only on spatial cues while ignoring motion cues. Therefore, they tend to overfit the training data and do not generalize well to out-of-domain (OOD) distributions. To address this problem, prior works exploited additional cues such as optical flow, background subtraction masks, etc. However, obtaining video data with annotations such as optical flow is a challenging task. In this paper, we utilize temporal information to improve OOD performance. The challenge lies in how to model the temporal information in an interpretable way that creates a very noticeable difference. We therefore devise a strategy that integrates temporal context into the development of VFS. Our approach gives rise to deep learning architectures, namely MUSTAN1 and MUSTAN2, based on the idea of multi-scale temporal context as attention, which aids our models in learning better representations beneficial for VFS. Further, we introduce a new dataset, the Indoor Surveillance Dataset (ISD). It has multiple frame-level annotations, i.e., binary masks, depth maps, and instance semantic annotations, so ISD can also benefit other computer vision tasks. We validate the efficacy of our architectures and compare their performance against baselines. We demonstrate that the proposed methods significantly outperform the benchmarks on OOD data. In addition, performance on certain video categories is improved due to ISD.
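The core idea of "multi-scale temporal context as attention" can be sketched as follows. This is a minimal illustrative NumPy sketch, not the paper's actual MUSTAN1/MUSTAN2 architecture: the pooling windows, fusion rule, and gating function are all assumptions chosen for clarity.

```python
import numpy as np

def multiscale_temporal_attention(frames, scales=(2, 4, 8)):
    """Illustrative sketch (not the published architecture):
    pool temporal context at several scales, turn the fused
    context into an attention gate, and modulate the current
    frame's features with it.

    frames: (T, H, W, C) array of per-frame feature maps.
    Returns attended features for the last frame, shape (H, W, C).
    """
    T = frames.shape[0]
    current = frames[-1]                       # frame to be segmented
    contexts = []
    for s in scales:
        window = frames[max(0, T - s):]        # last `s` frames = one temporal scale
        contexts.append(window.mean(axis=0))   # average-pooled temporal context
    context = np.mean(contexts, axis=0)        # fuse scales (simple average here)

    # attention: per-location similarity between current features and
    # the multi-scale temporal context, squashed to (0, 1)
    logits = (current * context).sum(axis=-1, keepdims=True)
    attn = 1.0 / (1.0 + np.exp(-logits))       # sigmoid gate
    return current * attn                      # context-modulated features
```

In a full model, the gated features would feed a segmentation head; regions whose appearance is consistent with the temporal context (static background) receive different attention than moving foreground.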