NFDI4DS | UHH-SEMS - Publication Details

Federico Tombari

ORCID: 0000-0001-5598-5212

About

Contact & Profiles

OpenAlex ID: A5041092666

Research Areas

Robotics and Sensor-Based Localization
Advanced Vision and Imaging
3D Shape Modeling and Analysis
Advanced Image and Video Retrieval Techniques
Advanced Neural Network Applications
Human Pose and Action Recognition
3D Surveying and Cultural Heritage
Computer Graphics and Visualization Techniques
Multimodal Machine Learning Applications
Video Surveillance and Tracking Methods
Generative Adversarial Networks and Image Synthesis
Domain Adaptation and Few-Shot Learning
Robot Manipulation and Learning
Anomaly Detection Techniques and Applications
Image Processing and 3D Reconstruction
Optical measurement and interference techniques
Image Processing Techniques and Applications
Image and Object Detection Techniques
Adversarial Robustness in Machine Learning
Medical Image Segmentation Techniques
Autonomous Vehicle Technology and Safety
Image Retrieval and Classification Techniques
Advanced Image Processing Techniques
Remote Sensing and LiDAR Applications
Surgical Simulation and Training

Google (Switzerland)
2019-2025

Technical University of Munich
2015-2024

Google (United States)
2019-2024

University of Bologna
2009-2022

University of Catania
2022

National Research Council
2022

Universidad de Las Palmas de Gran Canaria
2022

Menlo School
2022

Institut national de recherche en informatique et en automatique
2022

Amazon (United States)
2022

Deeper Depth Prediction with Fully Convolutional Residual Networks

OPENALEX - Publications

Iro Laina, Christian Rupprecht, Vasileios Belagiannis, Federico Tombari, Nassir Navab

This paper addresses the problem of estimating the depth map of a scene given a single RGB image. We propose a fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps. In order to improve the output resolution, we present a novel way to efficiently learn feature map up-sampling within the network. For optimization, we introduce the reverse Huber loss that is particularly suited for the task at hand and driven by the value distributions commonly present in depth maps. Our model is composed...

10.1109/3dv.2016.32 article EN 2016-10-01

SSD-6D: Making RGB-Based 3D Detection and 6D Pose Estimation Great Again

Wadim Kehl, Fabian Manhardt, Federico Tombari, Slobodan Ilić, Nassir Navab

We present a novel method for detecting 3D model instances and estimating their 6D poses from RGB data in a single shot. To this end, we extend the popular SSD paradigm to cover the full 6D pose space and train on synthetic model data only. Our approach competes with or surpasses current state-of-the-art methods that leverage RGBD data on multiple challenging datasets. Furthermore, our method produces these results at around 10 Hz, which is many times faster than the related methods. For the sake of reproducibility, we make our trained networks...

10.1109/iccv.2017.169 article EN 2017-10-01

SHOT: Unique signatures of histograms for surface and texture description

Samuele Salti, Federico Tombari, Luigi Di Stefano

10.1016/j.cviu.2014.04.011 article EN Computer Vision and Image Understanding 2014-05-06

CNN-SLAM: Real-Time Dense Monocular SLAM with Learned Depth Prediction

Keisuke Tateno, Federico Tombari, Iro Laina, Nassir Navab

Given the recent advances in depth prediction from Convolutional Neural Networks (CNNs), this paper investigates how predicted depth maps from a deep neural network can be deployed for the goal of accurate and dense monocular reconstruction. We propose a method where CNN-predicted dense depth maps are naturally fused together with depth measurements obtained from direct monocular SLAM, based on a scheme that privileges depth prediction in image locations where monocular SLAM approaches tend to fail, e.g. along low-textured regions, and vice-versa. We demonstrate the use of depth prediction to estimate the absolute scale...

10.1109/cvpr.2017.695 preprint EN 2017-07-01

Performance Evaluation of 3D Keypoint Detectors

Federico Tombari, Samuele Salti, Luigi Di Stefano

10.1007/s11263-012-0545-4 article EN International Journal of Computer Vision 2012-07-20

3D Point Capsule Networks

Yongheng Zhao, Tolga Birdal, Haowen Deng, Federico Tombari

In this paper, we propose 3D point-capsule networks, an auto-encoder designed to process sparse 3D point clouds while preserving spatial arrangements of the input data. 3D capsule networks arise as a direct consequence of our unified formulation of the common 3D auto-encoders. The dynamic routing scheme and the peculiar 2D latent space deployed by our capsule networks bring in improvements for several common point cloud-related tasks, such as object classification, object reconstruction and part segmentation, as substantiated by our extensive evaluations. Moreover, it...

10.1109/cvpr.2019.00110 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Tutorial: Point Cloud Library: Three-Dimensional Object Recognition and 6 DOF Pose Estimation

Aitor Aldomà, Zoltán-Csaba Márton, Federico Tombari, Walter Wohlkinger, Christian Potthast, and 4 more

With the advent of new-generation depth sensors, the use of three-dimensional (3-D) data is becoming increasingly popular. As these sensors are commodity hardware and sold at low cost, a rapidly growing group of people can acquire 3-D data cheaply and in real time.

10.1109/mra.2012.2206675 article EN publisher-specific-oa IEEE Robotics & Automation Magazine 2012-09-01

Unique shape context for 3d data description

Federico Tombari, Samuele Salti, Luigi Di Stefano

The use of robust feature descriptors is now key for many 3D tasks such as 3D object recognition and surface alignment. Many descriptors have been proposed in the literature which are based on a non-unique local Reference Frame and hence require the computation of multiple descriptions at each feature point. In this paper we show how to deploy a unique local Reference Frame to improve the accuracy and reduce the memory footprint of the well-known 3D Shape Context descriptor. We validate our proposal by means of an experimental analysis carried out on a large dataset of 3D scenes...

10.1145/1877808.1877821 article EN 2010-10-25

A combined texture-shape descriptor for enhanced 3D feature matching

Federico Tombari, Samuele Salti, Luigi Di Stefano

Motivated by the increasing availability of 3D sensors capable of delivering both shape and texture information, this paper presents a novel descriptor for feature matching in 3D data enriched with texture. The proposed approach stems from the theory of a recently proposed descriptor which relies on shape only, and represents its generalization to the case of multiple cues associated with a 3D mesh. The proposed descriptor, dubbed CSHOT, is demonstrated to notably improve the accuracy of feature matching in challenging object recognition scenarios characterized by the presence of clutter and occlusions.

10.1109/icip.2011.6116679 article EN 2011-09-01

Neural Fields in Visual Computing and Beyond

Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, and 5 more

Recent advances in machine learning have led to increased interest in solving visual computing problems using methods that employ coordinate-based neural networks. These methods, which we call neural fields, parameterize physical properties of scenes or objects across space and time. They have seen widespread success in problems such as 3D shape and image synthesis, animation of human bodies, 3D reconstruction, and pose estimation. Rapid progress has led to numerous papers, but a consolidation of the discovered knowledge has not yet...

10.1111/cgf.14505 article EN publisher-specific-oa Computer Graphics Forum 2022-05-01

Registration with the Point Cloud Library: A Modular Framework for Aligning in 3-D

Dirk Holz, Alexandru Eugen Ichim, Federico Tombari, Radu Bogdan Rusu, Sven Behnke

Registration is an important step when processing three-dimensional (3-D) point clouds. Applications for registration range from object modeling and tracking to simultaneous localization and mapping (SLAM). This article presents the open-source point cloud library (PCL) and the tools available for point cloud registration. The PCL incorporates methods for the initial alignment of point clouds using a variety of local shape feature descriptors, as well as for refining initial alignments using different variants of the well-known iterative closest point (ICP) algorithm....

10.1109/mra.2015.2432331 article EN IEEE Robotics & Automation Magazine 2015-09-17

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

Gu Wang, Fabian Manhardt, Federico Tombari, Xiangyang Ji

6D pose estimation from a single RGB image is a fundamental task in computer vision. The current top-performing deep learning-based methods rely on an indirect strategy, i.e., first establishing 2D-3D correspondences between the coordinates in the image plane and the object coordinate system, and then applying a variant of the PnP/RANSAC algorithm. However, this two-stage pipeline is not end-to-end trainable, and is thus hard to employ for many tasks requiring differentiable poses. On the other hand, methods based on direct regression are...

10.1109/cvpr46437.2021.01634 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Learning in an Uncertain World: Representing Ambiguity Through Multiple Hypotheses

Christian Rupprecht, Iro Laina, Robert DiPietro, Maximilian Baust, Federico Tombari, and 2 more

Many prediction tasks contain uncertainty. In some cases, uncertainty is inherent in the task itself. In future prediction, for example, many distinct outcomes are equally valid. In other cases, uncertainty arises from the way data is labeled. For example, in object detection, many objects of interest often go unlabeled, and in human pose estimation, occluded joints are often labeled with ambiguous values. In this work we focus on a principled approach for handling such scenarios. In particular, we propose a framework for reformulating existing single-prediction models...

10.1109/iccv.2017.388 article EN 2017-10-01

Learning 3D Semantic Scene Graphs From 3D Indoor Reconstructions

Johanna Wald, Helisa Dhamo, Nassir Navab, Federico Tombari

Scene understanding has been of high interest in computer vision. It encompasses not only identifying objects in a scene, but also their relationships within the given context. With this goal, a recent line of works tackles 3D semantic segmentation and scene layout prediction. In our work we focus on scene graphs, a data structure that organizes the entities of a scene in a graph, where objects are nodes and their relationships are modeled as edges. We leverage inference on scene graphs as a way to carry out 3D scene understanding, mapping objects and their relationships. In particular, we propose a learned...

10.1109/cvpr42600.2020.00402 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Andy Zeng, Adrian Wong, Stefan Welker, Krzysztof Choromański, Federico Tombari, and 6 more

Large pretrained (e.g., "foundation") models exhibit distinct capabilities depending on the domain of data they are trained on. While these domains are generic, they may only barely overlap. For example, visual-language models (VLMs) are trained on Internet-scale image captions, but large language models (LMs) are further trained on Internet-scale text with no images (e.g., spreadsheets, SAT questions, code). As a result, these models store different forms of commonsense knowledge across different domains. In this work, we show that this diversity is symbiotic, and can be leveraged through...

10.48550/arxiv.2204.00598 preprint EN other-oa arXiv (Cornell University) 2022-01-01

SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation

Yan Di, Fabian Manhardt, Gu Wang, Xiangyang Ji, Nassir Navab, and 1 more

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (i.e. the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem. While end-to-end methods have recently demonstrated promising results at high efficiency, they are still inferior when compared with elaborate PnP/RANSAC-based approaches in terms of pose accuracy. In this work, we address this shortcoming by means of a novel reasoning about self-occlusion, in order to establish a two-layer...

10.1109/iccv48922.2021.01217 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Learning Graph Embeddings for Compositional Zero-shot Learning

Muhammad Ferjad Naeem, Yongqin Xian, Federico Tombari, Zeynep Akata

In compositional zero-shot learning, the goal is to recognize unseen compositions (e.g. old dog) of observed visual primitives, namely states (e.g. old, cute) and objects (e.g. car, dog), in the training set. This is challenging because the same state can, for example, alter the visual appearance of a dog drastically differently from a car. As a solution, we propose a novel graph formulation called Compositional Graph Embedding (CGE) that learns image features, compositional classifiers and latent representations of visual primitives in an end-to-end manner. The key to our approach is exploiting...

10.1109/cvpr46437.2021.00101 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

SHIFT: A Synthetic Driving Dataset for Continuous Multi-Task Domain Adaptation

Tao Sun, Mattia Segù, Janis Postels, Yuxuan Wang, Luc Van Gool, and 3 more

Adapting to a continuously evolving environment is a safety-critical challenge inevitably faced by all autonomous-driving systems. Existing image- and video-based driving datasets, however, fall short of capturing the mutable nature of the real world. In this paper, we introduce the largest multi-task synthetic dataset for autonomous driving, SHIFT. It presents discrete and continuous shifts in cloudiness, rain and fog intensity, time of day, and vehicle and pedestrian density. Featuring a comprehensive sensor suite...

10.1109/cvpr52688.2022.02068 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

ZebraPose: Coarse to Fine Surface Encoding for 6DoF Object Pose Estimation

Yongzhi Su, Mahdi Saleh, Torben Fetzer, Jason Rambach, Nassir Navab, and 3 more

Establishing correspondences from image to 3D has been a key task of 6DoF object pose estimation for a long time. To predict pose more accurately, deeply learned dense maps replaced sparse templates. Dense methods also improved pose estimation in the presence of occlusion. More recently, researchers have shown improvements by learning object fragments as segmentation. In this work, we present a discrete descriptor, which can represent the object surface densely. By incorporating a hierarchical binary grouping, we can encode the object surface very efficiently....

10.1109/cvpr52688.2022.00662 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
