- Robotics and Sensor-Based Localization
- Advanced Vision and Imaging
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- 3D Surveying and Cultural Heritage
- Robot Manipulation and Learning
- Image and Object Detection Techniques
- Optical measurement and interference techniques
- Human Pose and Action Recognition
- Industrial Vision Systems and Defect Detection
- 3D Shape Modeling and Analysis
- Medical Image Segmentation Techniques
- Domain Adaptation and Few-Shot Learning
- Image Retrieval and Classification Techniques
- Access Control and Trust
- Artificial Intelligence in Healthcare and Education
- Indoor and Outdoor Localization Technologies
- Hand Gesture Recognition Systems
- Cloud Data Security Solutions
- Remote Sensing and LiDAR Applications
- Multimodal Machine Learning Applications
- Video Surveillance and Tracking Methods
- Cryptography and Data Security
- Anatomy and Medical Technology
- Augmented Reality Applications
Heidelberg University, 2017-2021
Heidelberg University, 2017-2019
TU Dresden, 2012-2017
RANSAC is an important algorithm in robust optimization and a central building block for many computer vision applications. In recent years, traditionally hand-crafted pipelines have been replaced by deep learning pipelines, which can be trained in an end-to-end fashion. However, RANSAC has so far not been used as part of such deep learning pipelines, because its hypothesis selection procedure is non-differentiable. In this work, we present two different ways to overcome this limitation. The most promising approach is inspired by reinforcement...
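To make the non-differentiable step concrete, here is a minimal sketch of vanilla RANSAC on a toy 2D line-fitting problem. It is an illustration only, not the method proposed in the paper: the function name `ransac_line`, the line model, the threshold, and the synthetic data are all assumptions chosen for brevity.

```python
import numpy as np

def ransac_line(points, n_hypotheses=64, inlier_threshold=0.1, seed=0):
    """Fit a 2D line y = a*x + b with vanilla RANSAC (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    hypotheses, scores = [], []
    for _ in range(n_hypotheses):
        # Sample a minimal set: two distinct points define a line.
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if np.isclose(x1, x2):
            continue  # skip vertical lines, which y = a*x + b cannot represent
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Score the hypothesis by counting inliers under a hard threshold.
        residuals = np.abs(points[:, 1] - (a * points[:, 0] + b))
        hypotheses.append((a, b))
        scores.append(int(np.sum(residuals < inlier_threshold)))
    # Hypothesis selection: argmax is piecewise constant in the scores,
    # so no useful gradient flows back through this step.
    best = int(np.argmax(scores))
    return hypotheses[best], scores[best]

# Hypothetical toy data: 80 points near y = 2x + 0.5 plus 20 gross outliers.
data_rng = np.random.default_rng(1)
x = data_rng.uniform(-1, 1, size=80)
inliers = np.stack([x, 2.0 * x + 0.5 + data_rng.normal(0, 0.01, size=80)], axis=1)
outliers = data_rng.uniform(-3, 3, size=(20, 2))
points = np.concatenate([inliers, outliers])

(a, b), support = ransac_line(points)
print(f"a={a:.2f}, b={b:.2f}, inliers={support}")
```

The `argmax` over inlier counts is the selection procedure the abstract refers to: small changes to the inputs either leave the selected hypothesis unchanged or switch it abruptly, which is why such a step cannot be trained through directly with gradient descent.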
In recent years, the task of estimating the 6D pose of object instances and complete scenes, i.e. camera localization, from a single input image has received considerable attention. Consumer RGB-D cameras have made this feasible, even for difficult, texture-less objects and scenes. In this work, we show that a single RGB image is sufficient to achieve visually convincing results. Our key concept is to model and exploit the uncertainty of the system at all stages of the processing pipeline. The uncertainty comes in the form of continuous distributions over 3D coordinates...
Popular research areas like autonomous driving and augmented reality have renewed the interest in image-based camera localization. In this work, we address the task of predicting the 6D camera pose from a single RGB image in a given 3D environment. With the advent of neural networks, previous works have either learned the entire camera localization process, or multiple components of a camera localization pipeline. Our key contribution is to demonstrate and explain that learning a single component of this pipeline is sufficient. This component is a fully convolutional neural network for densely...
We present Neural-Guided RANSAC (NG-RANSAC), an extension to the classic RANSAC algorithm from robust optimization. NG-RANSAC uses prior information to improve model hypothesis search, increasing the chance of finding outlier-free minimal sets. Previous works use heuristic side-information like hand-crafted descriptor distance to guide hypothesis search. In contrast, we learn hypothesis search in a principled fashion that lets us optimize an arbitrary task loss during training, leading to large improvements on computer vision tasks....
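The effect of guided sampling on the chance of drawing an outlier-free minimal set can be illustrated with the standard RANSAC iteration bound N = log(1 - p) / log(1 - w^m), where w is the probability that a single draw is an inlier, m is the minimal set size, and p is the desired confidence. The sketch below is a back-of-the-envelope calculation, not NG-RANSAC itself: the helper names, the hand-set weights (standing in for a learned prior), the minimal set size of 5, and the with-replacement approximation are all assumptions.

```python
import math

def iterations_needed(inlier_prob, minimal_set_size, confidence=0.99):
    """Standard RANSAC bound on draws needed to hit one all-inlier minimal set."""
    p_clean = inlier_prob ** minimal_set_size
    if p_clean >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))

def weighted_inlier_prob(weights, is_inlier):
    """Probability that one weighted draw is an inlier (sampling with replacement)."""
    total = sum(weights)
    return sum(w for w, inl in zip(weights, is_inlier) if inl) / total

# Hypothetical scenario: 100 correspondences, 30 of them inliers.
is_inlier = [True] * 30 + [False] * 70

# Uniform sampling: every correspondence is equally likely to be drawn.
uniform_weights = [1.0] * 100
# Guided sampling: an assumed prior that rates inliers 4x higher than outliers.
guided_weights = [4.0] * 30 + [1.0] * 70

m = 5  # assumed minimal set size
uniform_iters = iterations_needed(weighted_inlier_prob(uniform_weights, is_inlier), m)
guided_iters = iterations_needed(weighted_inlier_prob(guided_weights, is_inlier), m)
print(uniform_iters, guided_iters)
```

Even an imperfect prior reduces the required number of hypotheses by well over an order of magnitude in this toy setting, because the clean-sample probability w^m grows steeply with w.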
Analysis-by-synthesis has been a successful approach for many tasks in computer vision, such as 6D pose estimation of an object in an RGB-D image, which is the topic of this work. The idea is to compare the observation with the output of a forward process, such as a rendered image of the object of interest in a particular pose. Due to occlusion or complicated sensor noise, it can be difficult to perform this comparison in a meaningful way. We propose an approach that "learns to compare", while taking these difficulties into account. This is done by describing the posterior density...
We describe a learning-based system that estimates the camera position and orientation from a single input image relative to a known environment. The system is flexible w.r.t. the amount of information available at test and at training time, catering to different applications. Input images can be RGB-D or RGB, and a 3D model of the environment can be utilized for training but is not necessary. In the minimal case, our system requires only RGB images and ground truth poses at training time, and it requires only a single RGB image at test time. The framework consists of a deep neural network and fully differentiable pose optimization. The neural network predicts...
This paper addresses the task of estimating the 6D pose of a known 3D object from a single RGB-D image. Most modern approaches solve this task in three steps: i) compute local features, ii) generate a pool of pose hypotheses, iii) select and refine a pose from the pool. This work focuses on the second step. While all existing approaches generate the hypotheses pool via local reasoning, e.g. RANSAC or Hough-Voting, we are the first to show that global reasoning is beneficial at this stage. In particular, we formulate a novel fully-connected Conditional Random Field (CRF) that outputs...
Fitting model parameters to a set of noisy data points is a common problem in computer vision. In this work, we fit the 6D camera pose to a set of noisy correspondences between the 2D input image and a known 3D environment. We estimate these correspondences from the image using a neural network. Since the correspondences often contain outliers, we utilize a robust estimator such as Random Sample Consensus (RANSAC) or Differentiable RANSAC (DSAC) to fit the pose parameters. When the problem domain, e.g. the space of all 2D-3D correspondences, is large or ambiguous, a single network does not cover the domain well....
We address a core problem of computer vision: detection and description of 2D feature points for image matching. For a long time, hand-crafted designs, like the seminal SIFT algorithm, were unsurpassed in accuracy and efficiency. Recently, learned feature detectors emerged that implement detection and description using neural networks. Training these networks usually resorts to optimizing low-level matching scores, often pre-defining sets of image patches which should or should not match, or which should or should not contain key points. Unfortunately, increased matching scores...
We present the evaluation methodology, datasets and results of the BOP Challenge 2022, the fourth in a series of public competitions organized with the goal to capture the status quo in the field of 6D object pose estimation from an RGB/RGB-D image. In 2022, we witnessed another significant improvement in the pose estimation accuracy: the state of the art, which was 56.9 AR_C in 2019 (Vidal et al.) and 69.8 AR_C in 2020 (CosyPose), moved to new heights of 83.7 AR_C (GDRNPP). Out of 49...
Learning-based visual relocalizers exhibit leading pose accuracy, but require hours or days of training. Since training needs to happen on each new scene again, long training times make learning-based relocalization impractical for most applications, despite its promise of high accuracy. In this paper we show how such a system can actually achieve the same accuracy in less than 5 minutes. We start from the obvious: a relocalization network can be split into a scene-agnostic feature backbone and a scene-specific prediction head. Less...
This work addresses the task of camera localization in a known 3D scene given a single input RGB image. State-of-the-art approaches accomplish this in two steps: firstly, regressing for every pixel in the image its 3D scene coordinate and subsequently, using these coordinates to estimate the final 6D camera pose via RANSAC. To solve the first step, Random Forests (RFs) are typically used. On the other hand, Neural Networks (NNs) reign in many dense regression tasks, but are not test-time efficient. We ask the question: which is best...
We present a robust estimator for fitting multiple parametric models of the same form to noisy measurements. Applications include finding multiple vanishing points in man-made scenes, fitting planes to architectural imagery, or estimating multiple rigid motions within the same sequence. In contrast to previous works, which resorted to hand-crafted search strategies for multiple model detection, we learn the search strategy from data. A neural network conditioned on previously detected models guides a RANSAC estimator to different subsets of all measurements, thereby finding model instances one...
Benchmark datasets that measure camera pose accuracy have driven progress in visual re-localisation research. To obtain poses for thousands of images, it is common to use a reference algorithm to generate pseudo ground truth. Popular choices include Structure-from-Motion (SfM) and Simultaneous-Localisation-and-Mapping (SLAM) using additional sensors like depth cameras if available. Re-localisation benchmarks thus measure how well each method replicates the results of the reference algorithm. This begs the question whether...
State-of-the-art computer vision algorithms often achieve efficiency by making discrete choices about which hypotheses to explore next. This allows allocation of computational resources to promising candidates; however, such decisions are non-differentiable. As a result, these algorithms are hard to train in an end-to-end fashion. In this work we propose to learn an efficient algorithm for the task of 6D object pose estimation. Our system optimizes the parameters of an existing state-of-the-art pose estimation system using reinforcement...
Accurate pose estimation of object instances is a key aspect in many applications, including augmented reality or robotics. For example, a task of a domestic robot could be to fetch an item from an open drawer. The poses of both the drawer and the item have to be known by the robot in order to fulfil the task. 6D pose estimation of rigid objects has been addressed with great success in recent years. In large part, this has been due to the advent of consumer-level RGB-D cameras, which provide rich, robust input data. However, the practical use of state-of-the-art pose estimation approaches is limited...