NFDI4DS | UHH-SEMS - Publication Details

Monocular Quasi-Dense 3D Object Tracking

OPENALEX - Publications

Hou-Ning Hu Yung-Hsu Yang Tobias Fischer Trevor Darrell Fisher Yu and 1 more

A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving. We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform. The object association leverages quasi-dense similarity learning to identify objects in various poses and viewpoints with appearance cues only. After initial...

10.1109/tpami.2022.3168781 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-04-19

UniDepth: Universal Monocular Metric Depth Estimation

Luigi Piccinelli Yung-Hsu Yang Christos Sakaridis Mattia Segù Siyuan Li and 2 more

10.1109/cvpr52733.2024.00963 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking

Nicolas Baumann Michael Baumgartner Edoardo Ghignone Jonas Kühne Tobias Fischer and 3 more

Accurate detection and tracking of surrounding objects is essential to enable self-driving vehicles. While Light Detection and Ranging (LiDAR) sensors have set the benchmark for high performance, the appeal of camera-only solutions lies in their cost-effectiveness. Notably, despite the prevalent use of Radio Detection and Ranging (RADAR) sensors in automotive systems, their potential in 3D detection and tracking has been largely disregarded due to data sparsity and measurement noise. As a recent development, the combination of RADARs and cameras is emerging as a promising solution. This paper...

10.48550/arxiv.2403.15313 preprint EN arXiv (Cornell University) 2024-03-22

CC-3DT: Panoramic 3D Object Tracking via Cross-Camera Fusion

Tobias Fischer Yung-Hsu Yang Suryansh Kumar Min Sun Fisher Yu

To track the 3D locations and trajectories of the other traffic participants at any given time, modern autonomous vehicles are equipped with multiple cameras that cover the vehicle's full surroundings. Yet, camera-based 3D object tracking methods prioritize optimizing the single-camera setup and resort to post-hoc fusion in a multi-camera setup. In this paper, we propose a method for panoramic 3D object tracking, called CC-3DT, that associates and models object trajectories both temporally and across views, and improves the overall tracking consistency. In particular, our...

10.48550/arxiv.2212.01247 preprint EN cc-by arXiv (Cornell University) 2022-01-01

Dense Prediction with Attentive Feature Aggregation

Yung-Hsu Yang Thomas E. Huang Min Sun Samuel Rota Bulò Peter Kontschieder and 1 more

Aggregating information from features across different layers is essential for dense prediction models. Despite its limited expressiveness, vanilla feature concatenation dominates the choice of aggregation operations. In this paper, we introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations. AFA exploits both spatial and channel attention to compute weighted averages of the layer activations. Inspired by neural volume rendering, we further extend AFA with Scale-Space Rendering...

10.1109/wacv56688.2023.00018 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

UniDepth: Universal Monocular Metric Depth Estimation

Luigi Piccinelli Yung-Hsu Yang Christos Sakaridis Mattia Segù Siyuan Li and 2 more

Accurate monocular metric depth estimation (MMDE) is crucial to solving downstream tasks in 3D perception and modeling. However, the remarkable accuracy of recent MMDE methods is confined to their training domains. These methods fail to generalize to unseen domains even in the presence of moderate domain gaps, which hinders their practical applicability. We propose a new model, UniDepth, capable of reconstructing metric 3D scenes from solely single images across domains. Departing from the existing MMDE methods, UniDepth directly predicts metric 3D points from the input image at...

10.48550/arxiv.2403.18913 preprint EN arXiv (Cornell University) 2024-03-27

Monocular Quasi-Dense 3D Object Tracking

Hou-Ning Hu Yung-Hsu Yang Tobias Fischer Trevor Darrell Fisher Yu and 1 more

A reliable and accurate 3D tracking framework is essential for predicting future locations of surrounding objects and planning the observer's actions in numerous applications such as autonomous driving. We propose a framework that can effectively associate moving objects over time and estimate their full 3D bounding box information from a sequence of 2D images captured on a moving platform. The object association leverages quasi-dense similarity learning to identify objects in various poses and viewpoints with appearance cues only. After initial...

10.48550/arxiv.2103.07351 preprint EN other-oa arXiv (Cornell University) 2021-01-01

CR3DT: Camera-RADAR Fusion for 3D Detection and Tracking

Nicolas Baumann Michael Baumgartner Edoardo Ghignone Jonas Kühne Tobias Fischer and 3 more

10.1109/iros58592.2024.10801848 article EN 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2024-10-14

SLAck: Semantic, Location, and Appearance Aware Open-Vocabulary Tracking

Siyuan Li Ke Lei Yung-Hsu Yang Luigi Piccinelli Mattia Segù and 2 more

Open-vocabulary Multiple Object Tracking (MOT) aims to generalize trackers to novel categories not in the training set. Currently, the best-performing methods are mainly based on pure appearance matching. Due to the complexity of motion patterns in large-vocabulary scenarios and the unstable classification of novel objects, motion and semantics cues are either ignored or applied based on heuristics in the final matching steps by existing methods. In this paper, we present a unified framework SLAck that jointly considers semantics, location, and appearance priors...

10.48550/arxiv.2409.11235 preprint EN arXiv (Cornell University) 2024-09-17

Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking

Mattia Segù Luigi Piccinelli Siyuan Li Yung-Hsu Yang Bernt Schiele and 1 more

Multiple object tracking in complex scenarios - such as coordinated dance performances, team sports, or dynamic animal groups - presents unique challenges. In these settings, objects frequently move in coordinated patterns, occlude each other, and exhibit long-term dependencies in their trajectories. However, it remains a key open research question on how to model long-range dependencies within tracklets, interdependencies among tracklets, and the associated temporal occlusions. To this end, we introduce Samba, a novel linear-time...

10.48550/arxiv.2410.01806 preprint EN arXiv (Cornell University) 2024-10-02

Dense Prediction with Attentive Feature Aggregation

Yung-Hsu Yang Thomas E. Huang Samuel Rota Bulò Peter Kontschieder Fisher Yu

Aggregating information from features across different layers is an essential operation for dense prediction models. Despite its limited expressiveness, feature concatenation dominates the choice of aggregation operations. In this paper, we introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations. AFA exploits both spatial and channel attention to compute a weighted average of the layer activations. Inspired by neural volume rendering, we extend AFA with Scale-Space Rendering...

10.48550/arxiv.2111.00770 preprint EN other-oa arXiv (Cornell University) 2021-01-01