SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

DOI: 10.48550/arxiv.2409.11235 Publication Date: 2024-09-17
ABSTRACT
Open-vocabulary Multiple Object Tracking (MOT) aims to generalize trackers to novel categories not in the training set. Currently, the best-performing methods are mainly based on pure appearance matching. Due to the complexity of motion patterns in large-vocabulary scenarios and the unstable classification of novel objects, motion and semantics cues are either ignored or applied through heuristics in the final matching steps by existing methods. In this paper, we present a unified framework, SLAck, that jointly considers semantics, location, and appearance priors in the early steps of association and learns how to integrate all valuable information through a lightweight spatial and temporal object graph. Our method eliminates complex post-processing heuristics for fusing different cues and boosts association performance significantly for large-scale open-vocabulary tracking. Without bells and whistles, we outperform previous state-of-the-art methods for novel-class tracking on the open-vocabulary MOT and TAO TETA benchmarks. Our code is available at https://github.com/siyuanliii/SLAck.
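To make the three cues concrete, the following minimal sketch scores each track-detection pair on semantics, location, and appearance, then fuses the scores before matching. It is not SLAck's method: the paper learns the fusion through a spatial and temporal object graph, whereas the fixed weights, the greedy matcher, the threshold, and all names here (`fused_affinity`, `associate`, the `sem`/`box`/`app` keys) are assumptions chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fused_affinity(track, det, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted sum of semantic, location, and appearance affinity.

    Hypothetical stand-in: the fixed weights replace the learned fusion
    described in the abstract.
    """
    w_sem, w_loc, w_app = weights
    return (
        w_sem * cosine(track["sem"], det["sem"])    # semantic embedding similarity
        + w_loc * iou(track["box"], det["box"])     # spatial overlap
        + w_app * cosine(track["app"], det["app"])  # appearance embedding similarity
    )


def associate(tracks, dets, threshold=0.5):
    """Greedy one-to-one matching on the fused affinity, best pairs first."""
    scored = sorted(
        (
            (fused_affinity(t, d), i, j)
            for i, t in enumerate(tracks)
            for j, d in enumerate(dets)
        ),
        reverse=True,
    )
    used_tracks, used_dets, matches = set(), set(), []
    for score, i, j in scored:
        if score < threshold:
            break  # remaining pairs score even lower
        if i in used_tracks or j in used_dets:
            continue
        used_tracks.add(i)
        used_dets.add(j)
        matches.append((i, j))
    return sorted(matches)
```

Each track and detection is a dict with a semantic vector `sem`, a box `box`, and an appearance vector `app`; `associate` returns a list of `(track_index, detection_index)` pairs. The greedy loop keeps the sketch dependency-free; an optimal assignment solver such as `scipy.optimize.linear_sum_assignment` could replace it.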