Horst Possegger

ORCID: 0000-0002-5427-9938
Research Areas
  • Advanced Neural Network Applications
  • Domain Adaptation and Few-Shot Learning
  • Video Surveillance and Tracking Methods
  • Human Pose and Action Recognition
  • Multimodal Machine Learning Applications
  • Anomaly Detection Techniques and Applications
  • Advanced Image and Video Retrieval Techniques
  • Face Recognition and Analysis
  • Adversarial Robustness in Machine Learning
  • Autonomous Vehicle Technology and Safety
  • Advanced Vision and Imaging
  • Remote Sensing and LiDAR Applications
  • Traffic Prediction and Management Techniques
  • 3D Surveying and Cultural Heritage
  • 3D Shape Modeling and Analysis
  • Target Tracking and Data Fusion in Sensor Networks
  • Visual Attention and Saliency Detection
  • Advanced Optical Sensing Technologies
  • Robotics and Sensor-Based Localization
  • Infrastructure Maintenance and Monitoring
  • Image Enhancement Techniques
  • Vehicle License Plate Recognition
  • Traffic and Road Safety
  • Gait Recognition and Analysis
  • Hand Gesture Recognition Systems

Graz University of Technology
2015-2024

Institute of Computer Vision and Applied Computer Sciences
2012

The Visual Object Tracking challenge 2015, VOT2015, aims at comparing short-term single-object visual trackers that do not apply pre-learned models of object appearance. Results of 62 trackers are presented. The number of tested trackers makes VOT2015 the largest benchmark on short-term tracking to date. For each participating tracker, a short description is provided in the appendix. Features in which VOT2015 goes beyond its VOT2014 predecessor are: (i) a new dataset twice as large, with full annotation of targets by rotated bounding boxes and...

10.1109/iccvw.2015.79 preprint EN 2015-12-01

In this paper, we address the problem of model-free online object tracking based on color representations. According to the findings of recent benchmark evaluations, such trackers often tend to drift towards regions which exhibit a similar appearance compared to the object of interest. To overcome this limitation, we propose an efficient discriminative object model which allows us to identify potentially distracting regions in advance. Furthermore, we exploit this knowledge to adapt the object representation beforehand, so that distractors are suppressed and the risk of drifting...

10.1109/cvpr.2015.7298823 article EN 2015-06-01
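
The core of such a discriminative color model can be sketched in a few lines: a Bayes-rule likelihood ratio between object and surrounding (or distractor) color histograms. A minimal sketch, in which the bin count and smoothing constant are illustrative assumptions rather than the paper's exact settings:

import numpy as np

def object_likelihood(obj_pixels, surr_pixels, bins=16):
    """P(object | color) per RGB bin, from (N, 3) uint8 pixel arrays."""
    def hist(px):
        idx = px.astype(int) // (256 // bins)   # quantize each channel
        h = np.zeros((bins, bins, bins))
        np.add.at(h, (idx[:, 0], idx[:, 1], idx[:, 2]), 1)
        return h
    h_obj, h_surr = hist(obj_pixels), hist(surr_pixels)
    return h_obj / (h_obj + h_surr + 1e-8)      # discriminative per-bin score

Scoring the same ratio against the pixels of a detected distractor region, instead of the surroundings, yields the suppression weights the abstract refers to.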

Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence within ensembles. To this end, we divide the last embedding layer of a deep network into an embedding ensemble and formulate the task of training this ensemble as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. Further, we propose two loss functions which increase the diversity in our...

10.1109/tpami.2018.2848925 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-06-25
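
A hedged sketch of the boosting idea applied to a split embedding, assuming a contrastive pair loss; the reweighting rule below is a simplified stand-in for the paper's online gradient boosting update:

import torch
import torch.nn.functional as F

def boosted_embedding_loss(groups, labels, margin=0.5):
    """groups: list of K tensors (B, d_k), slices of the last embedding layer.
    Later learners see pairs reweighted by the errors of earlier learners."""
    same = labels[:, None].eq(labels[None, :]).float()
    w = torch.ones_like(same)                   # initial pair weights
    total = 0.0
    for g in groups:
        sim = F.normalize(g, dim=1) @ F.normalize(g, dim=1).t()
        pair_loss = same * (1 - sim) + (1 - same) * (sim - margin).clamp(min=0)
        total = total + (w * pair_loss).mean()
        w = w * pair_loss.detach().exp()        # boost hard pairs for the next learner
        w = w / w.mean()
    return total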

Learning similarity functions between image pairs with deep neural networks yields highly correlated activations of large embeddings. In this work, we show how to improve the robustness of such embeddings by exploiting the independence in ensembles. We divide the last embedding layer of a deep network into an embedding ensemble and formulate its training as an online gradient boosting problem. Each learner receives a reweighted training sample from the previous learners. This leverages large embedding sizes more effectively by significantly reducing correlation...

10.1109/iccv.2017.555 article EN 2017-10-01

Robust multi-object tracking-by-detection requires the correct assignment of noisy detection results to object trajectories. We address this problem by proposing an online approach based on the observation that object detectors primarily fail if objects are significantly occluded. In contrast to most existing work, we only rely on geometric information to efficiently overcome detection failures. In particular, we exploit the spatio-temporal evolution of occlusion regions, detector reliability, and target motion prediction to robustly...

10.1109/cvpr.2014.170 article EN IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2014-06-01

Domain adaptation is crucial to adapt a learned model to new scenarios, such as domain shifts or changing data distributions. Current approaches usually require a large amount of labeled or unlabeled data from the shifted domain. This can be a hurdle in fields which require continuous dynamic adaptation or suffer from scarcity of data, e.g. autonomous driving in challenging weather conditions. To address this problem of continuous adaptation to distribution shifts, we propose Dynamic Unsupervised Adaptation (DUA). By continuously adapting the statistics of the batch...

10.1109/cvpr52688.2022.01435 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
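
A minimal sketch of the adaptation step, assuming a PyTorch model with BatchNorm2d layers; the decay schedule and momentum floor are illustrative assumptions:

import torch
import torch.nn as nn

@torch.no_grad()
def adapt_bn_statistics(model, batch, momentum, decay=0.94, floor=0.005):
    """Update BatchNorm running statistics from one unlabeled test batch,
    with a momentum that decays over time. No gradients, no labels."""
    momentum = max(momentum * decay, floor)
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.train()               # recompute batch statistics on this forward pass
            m.momentum = momentum
    model(batch)                    # forward pass updates running_mean / running_var
    model.eval()
    return momentum

Called once per incoming batch (e.g. mom = adapt_bn_statistics(model, batch, mom)), this adapts the normalization statistics without touching any learned weights.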

We introduce a simple yet effective fusion method of LiDAR and RGB data to segment LiDAR point clouds. Utilizing the dense native range representation of the sensor and the setup calibration, we establish point correspondences between the two input modalities. Subsequently, we are able to warp and fuse features from one domain into the other. Therefore, we can jointly exploit information from both data sources within a single network. To show the merit of our method, we extend SqueezeSeg, a point cloud segmentation network, with an RGB feature branch and fuse it into the original...

10.1109/wacv45572.2020.9093584 article EN 2020-03-01
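
Establishing the correspondences amounts to projecting each LiDAR point into the camera and sampling the image feature map. A sketch assuming known intrinsics K (3x3), extrinsics T_cam_lidar (4x4), and a cloud stored in its dense (H, W) range-image order:

import numpy as np

def warp_rgb_to_range_view(points, rgb_feat, K, T_cam_lidar, range_hw):
    """points: (N, 3) LiDAR cloud; rgb_feat: (h, w, C) image features.
    Returns per-point image features rearranged into the (H, W) range view."""
    n = points.shape[0]
    cam = (np.c_[points, np.ones(n)] @ T_cam_lidar.T)[:, :3]   # to camera frame
    uvw = cam @ K.T
    u, v = uvw[:, 0] / uvw[:, 2], uvw[:, 1] / uvw[:, 2]        # pixel coordinates
    h, w, C = rgb_feat.shape
    valid = (uvw[:, 2] > 0) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    fused = np.zeros((n, C), dtype=rgb_feat.dtype)
    fused[valid] = rgb_feat[v[valid].astype(int), u[valid].astype(int)]
    return fused.reshape(*range_hw, C)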

Although deep neural networks enable impressive visual perception performance for autonomous driving, their robustness to varying weather conditions still requires attention. When adapting these models to changed environments, such as different weather conditions, they are prone to forgetting previously learned information. This catastrophic forgetting is typically addressed via incremental learning approaches, which usually re-train the model by either keeping a memory bank of training samples or keeping a copy of the entire...

10.1109/cvprw56347.2022.00339 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2022-06-01

Large scale Vision-Language (VL) models have shown tremendous success in aligning representations between visual and text modalities. This enables remarkable progress in zero-shot recognition, image generation & editing, and many other exciting tasks. However, VL models tend to over-represent objects while paying much less attention to verbs, and require additional tuning on video data for best action recognition performance. While previous work relied on large-scale, fully-annotated data, in this work we propose an...

10.1109/iccv51070.2023.00267 article EN IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Although action recognition systems can achieve top performance when evaluated on in-distribution test points, they are vulnerable to unanticipated distribution shifts in test data. However, test-time adaptation of video models against common distribution shifts has so far not been demonstrated. We propose to address this problem with an approach tailored to spatio-temporal models that is capable of adapting on a single video sample at each step. It consists of a feature distribution alignment technique that aligns online estimates of test set statistics towards the training...

10.1109/cvpr52729.2023.02198 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
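
A hedged sketch of the statistics-alignment idea for a single clip at a time; the class name, momentum, and L1 penalty are assumptions rather than the paper's exact formulation:

import torch

class OnlineStatAlign:
    """Running estimates of test-feature statistics, penalized towards the
    statistics collected on the training set (one clip per adaptation step)."""
    def __init__(self, train_mean, train_var, momentum=0.1):
        self.train_mean, self.train_var = train_mean, train_var
        self.momentum = momentum
        self.ema_mean, self.ema_var = None, None

    def loss(self, feats):
        # feats: (T, C) per-frame features of a single test clip
        mean, var = feats.mean(0), feats.var(0)
        if self.ema_mean is None:
            self.ema_mean, self.ema_var = mean.detach(), var.detach()
        # online estimate: EMA over past clips, refreshed with the current
        # (differentiable) clip statistics
        est_mean = (1 - self.momentum) * self.ema_mean + self.momentum * mean
        est_var = (1 - self.momentum) * self.ema_var + self.momentum * var
        self.ema_mean, self.ema_var = est_mean.detach(), est_var.detach()
        return ((est_mean - self.train_mean).abs().mean()
                + (est_var - self.train_var).abs().mean())

The returned loss is backpropagated into the model (or a subset of its parameters) before predicting on the same clip.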

We present a novel video saliency detection method to support human activity recognition and weakly supervised training of activity detection algorithms. Recent research has emphasized the need for analyzing salient information in videos to minimize dataset bias or to supervise weakly labeled training of activity detectors. In contrast to previous methods, we do not rely on supervision given by either eye-gaze or annotation data, but propose a fully unsupervised algorithm to find salient regions within videos. In general, we enforce the Gestalt principle of figure-ground segregation for both...

10.1109/cvpr.2015.7298864 article EN 2015-06-01

Combining foreground images from multiple views by projecting them onto a common ground-plane has been recently applied within many multi-object tracking approaches. These planar projections introduce severe artifacts and constrain most approaches to objects moving on a common 2D ground-plane. To overcome these limitations, we introduce the concept of an occupancy volume - exploiting the full geometry and the objects' center of mass - and develop an efficient algorithm for 3D object tracking. Individual objects are tracked using the local...

10.1109/cvpr.2013.310 article EN IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2013-06-01
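
The basic construction can be sketched as follows: project every voxel centre into each calibrated view and keep only voxels whose projections land on foreground in all views. The 3x4 projection matrices are assumed given; the tracking itself (mass densities, local optimization) is omitted:

import numpy as np

def occupancy_volume(masks, proj_mats, grid_pts):
    """masks: list of (H, W) binary foreground maps; proj_mats: 3x4 camera
    matrices; grid_pts: (M, 3) voxel centres. Returns per-voxel occupancy."""
    M = len(grid_pts)
    occ = np.ones(M)
    hom = np.c_[grid_pts, np.ones(M)]
    for mask, P in zip(masks, proj_mats):
        uvw = hom @ P.T
        u = (uvw[:, 0] / uvw[:, 2]).astype(int)
        v = (uvw[:, 1] / uvw[:, 2]).astype(int)
        h, w = mask.shape
        inside = (u >= 0) & (u < w) & (v >= 0) & (v < h) & (uvw[:, 2] > 0)
        vote = np.zeros(M)
        vote[inside] = mask[v[inside], u[inside]]
        occ *= vote                 # occupied only if foreground in every view
    return occ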

Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time. We propose to perform this adaptation via Activation Matching (ActMAD): we analyze activations of the model and align the activation statistics of the OOD test data to those of the training data. In contrast to existing methods, which model the distribution of entire channels in the ultimate layer of the feature extractor, we model the distribution of each feature in multiple layers across the network. This results in a more fine-grained supervision and makes...

10.1109/cvpr52729.2023.02313 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01
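
A minimal sketch of one such update, assuming a list of layers to supervise and their per-location activation means/variances precomputed on training data; the hook bookkeeping is simplified:

import torch

def actmad_step(model, layers, train_means, train_vars, batch, optimizer):
    """One test-time update: align the per-location activation means/variances
    of the chosen layers with statistics saved from the training data."""
    acts = {}
    hooks = [layer.register_forward_hook(
        lambda mod, inp, out, k=k: acts.__setitem__(k, out))
        for k, layer in enumerate(layers)]
    model(batch)                                    # populate `acts` via the hooks
    loss = sum((acts[k].mean(0) - train_means[k]).abs().mean()
               + (acts[k].var(0) - train_vars[k]).abs().mean()
               for k in range(len(layers)))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    for h in hooks:
        h.remove()
    return loss.item()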

While 3D object detection in LiDAR point clouds is well-established in academia and industry, the explainability of these models is a largely unexplored field. In this paper, we propose a method to generate attribution maps for the detected objects in order to better understand the behavior of such models. These maps indicate the importance of each 3D point in predicting the specific objects. Our method works with black-box models: we do not require any prior knowledge of the architecture nor access to the model's internals, like parameters, activations or...

10.1109/cvpr52688.2022.00121 article EN IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
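
A generic occlusion-sensitivity stand-in that conveys the black-box principle (the paper's actual perturbation and voting scheme is more elaborate): remove the points in each grid cell and record the drop in detection confidence. detector_score is a hypothetical callable from a point cloud to one object's confidence:

import numpy as np

def attribution_map(detector_score, points, grid=(32, 32), extent=50.0):
    """points: (N, 3) LiDAR cloud. Returns a (grid x grid) importance map
    over the ground plane, one score per occluded cell."""
    ix = np.clip(((points[:, 0] / extent + 0.5) * grid[0]).astype(int), 0, grid[0] - 1)
    iy = np.clip(((points[:, 1] / extent + 0.5) * grid[1]).astype(int), 0, grid[1] - 1)
    base = detector_score(points)
    attr = np.zeros(grid)
    for i in range(grid[0]):
        for j in range(grid[1]):
            keep = ~((ix == i) & (iy == j))          # drop the points in this cell
            attr[i, j] = base - detector_score(points[keep])
    return attr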

We present a novel approach for spatiotemporal saliency detection by optimizing a unified criterion of color contrast, motion contrast, appearance, and background cues. To this end, we first abstract the video into temporal superpixels. Second, we propose a graph structure exploiting these cues to assign edge weights. The salient segments are then extracted by applying a spectral foreground detection method, quantum cuts, on this graph. We evaluate our approach on several public datasets for activity localization and demonstrate the favorable performance of the proposed...

10.1109/tmm.2017.2713982 article EN IEEE Transactions on Multimedia 2017-06-08

Our MATE is the first Test-Time-Training (TTT) method designed for 3D data. It makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data. Like existing TTT methods from the 2D image domain, MATE also leverages test data for adaptation. Its test-time objective is that of a Masked Autoencoder: a large portion of each test point cloud is removed before it is fed to the network, which is tasked with reconstructing the full point cloud. Once the network is updated, it is used to classify the point cloud. We test MATE on several 3D object classification datasets and show...

10.1109/iccv51070.2023.01532 article EN IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01
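
One test-time-training step can be sketched as below; encoder, decoder and classifier are hypothetical modules, and the 90% masking ratio is an assumption:

import torch

def chamfer(a, b):
    """Symmetric Chamfer distance between point sets a: (B, N, 3), b: (B, M, 3)."""
    d = torch.cdist(a, b)
    return d.min(2).values.mean() + d.min(1).values.mean()

def mate_step(encoder, decoder, classifier, points, optimizer, mask_ratio=0.9):
    """One TTT step on a single test cloud `points` (N, 3): reconstruct the
    masked cloud, update the weights, then classify with the adapted encoder."""
    keep = torch.randperm(points.shape[0])[int(points.shape[0] * mask_ratio):]
    recon = decoder(encoder(points[keep].unsqueeze(0)))   # predict the full cloud
    loss = chamfer(recon, points.unsqueeze(0))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    with torch.no_grad():
        return classifier(encoder(points.unsqueeze(0))).argmax(-1)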

The sensing process of large-scale LiDAR point clouds inevitably causes large blind spots, i.e. regions not visible to the sensor. We demonstrate how these inherent sampling properties can be effectively utilized for self-supervised representation learning, by designing a highly effective pretraining framework that considerably reduces the need for tedious 3D annotations to train state-of-the-art object detectors. Our Masked AutoEncoder for LiDAR point clouds (MAELi) intuitively leverages the sparsity of the point clouds in both the encoder and decoder...

10.1109/wacv57701.2024.00335 article EN IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024-01-03