Tomáš Pajdla

ORCID: 0000-0001-6325-0072
Research Areas
  • Advanced Vision and Imaging
  • Robotics and Sensor-Based Localization
  • Advanced Image and Video Retrieval Techniques
  • Optical measurement and interference techniques
  • Image Processing Techniques and Applications
  • Advanced Numerical Analysis Techniques
  • 3D Surveying and Cultural Heritage
  • Robotic Mechanisms and Dynamics
  • Image and Object Detection Techniques
  • Industrial Vision Systems and Defect Detection
  • Computational Geometry and Mesh Generation
  • Computer Graphics and Visualization Techniques
  • Advanced Image Processing Techniques
  • 3D Shape Modeling and Analysis
  • Image Processing and 3D Reconstruction
  • Satellite Image Processing and Photogrammetry
  • Polynomial and algebraic computation
  • Video Surveillance and Tracking Methods
  • Medical Image Segmentation Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Neural Network Applications
  • Image and Video Stabilization
  • Advanced Measurement and Metrology Techniques
  • Remote Sensing and LiDAR Applications
  • COVID-19 diagnosis using AI

Czech Technical University in Prague
2015-2024

Institute of Informatics of the Slovak Academy of Sciences
2017-2022

Politecnico di Milano
2022

Calorx Teachers' University
2020-2021

China University of Labor Relations
2020

Center for Economic Research and Graduate Education – Economics Institute
2015-2017

Neovision (Czechia)
2013

Universität Ulm
2010

Charles University
2008

First Technical University
2007

We tackle the problem of large scale visual place recognition, where the task is to quickly and accurately recognize the location of a given query photograph. We present the following three principal contributions. First, we develop a convolutional neural network (CNN) architecture that is trainable in an end-to-end manner directly for the place recognition task. The main component of this architecture, NetVLAD, is a new generalized VLAD layer, inspired by the "Vector of Locally Aggregated Descriptors" image representation commonly used...

10.1109/tpami.2017.2711011 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2017-06-01

The wide-baseline stereo problem, i.e. the problem of establishing correspondences between a pair of images taken from different viewpoints, is studied. A new set of image elements that are put into correspondence, the so-called extremal regions, is introduced. Extremal regions possess highly desirable properties: the set is closed under (1) continuous (and thus projective) transformation of image coordinates and (2) monotonic transformation of image intensities. An efficient (near linear complexity) and practically fast detection algorithm (near frame...

10.5244/c.16.36 article EN 2002-01-01

In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: it is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available...

10.1109/cvpr.2019.00828 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual and real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate 6 degree-of-freedom (6DOF) camera pose estimates. In this paper, we introduce the first benchmark datasets specifically designed for analyzing the impact of such factors on...

10.1109/cvpr.2018.00897 preprint EN 2018-06-01

We address the problem of large-scale visual place recognition for situations where the scene undergoes a major change in appearance, for example, due to illumination (day/night), change of seasons, aging, or structural modifications over time such as buildings built or destroyed. Such situations represent a major challenge for current large-scale place recognition methods. This work has the following three principal contributions. First, we demonstrate that matching across large changes in the scene appearance becomes much easier when both the query image and the database image depict the scene from...

10.1109/cvpr.2015.7298790 preprint EN 2015-06-01

Virtual immersive environments or telepresence setups often consist of multiple cameras that have to be calibrated. We present a convenient method for doing this. The minimum is three cameras, but there is no upper limit. The method is fully automatic, and a freely moving bright spot is the only calibration object. A set of virtual 3D points is made by waving the bright spot through the working volume. Its projections are found with subpixel precision and verified by a robust RANSAC analysis. The cameras do not have to see all points; only reasonable overlap between camera...

10.1162/105474605774785325 article EN PRESENCE Virtual and Augmented Reality 2005-08-01

We seek to predict the 6 degree-of-freedom (6DoF) pose of a query photograph with respect to a large indoor 3D map. The contributions of this work are three-fold. First, we develop a new large-scale visual localization method targeted for indoor environments. The method proceeds along three steps: (i) efficient retrieval of candidate poses that ensures scalability to large-scale environments, (ii) pose estimation using dense matching rather than local features to deal with textureless indoor scenes, and (iii) pose verification by virtual view synthesis to cope...

10.1109/cvpr.2018.00752 preprint EN 2018-06-01

We propose a novel method for the multi-view reconstruction problem. Surfaces which do not have direct support in the input 3D point cloud, and hence need not be photo-consistent, but which represent real parts of the scene (e.g. low-textured walls, windows, cars) are important for achieving complete reconstructions. We augmented the existing Labatut CGF 2009 method with the ability to cope with these difficult surfaces just by changing the t-edge weights in the construction of surfaces by a minimal s-t cut. Our method uses Visual-Hull to reconstruct the difficult surfaces which are not sampled densely enough...

10.1109/cvpr.2011.5995693 article EN 2011-06-01

It is known that the problem of multiview reconstruction can be solved in two steps: first estimate camera rotations and then translations using them. This paper presents new robust techniques for both of these steps. (i) Given pairwise relative rotations, global camera rotations are estimated linearly in least squares. (ii) Camera translations are estimated using a standard technique based on Second Order Cone Programming. Robustness is achieved by using only a subset of points according to a new criterion that diminishes the risk of choosing a mismatch. It is shown that only four points chosen in a special...

10.1109/cvpr.2007.383115 article EN 2007 IEEE Conference on Computer Vision and Pattern Recognition 2007-06-01

Repeated structures such as building facades, fences or road markings often represent a significant challenge for place recognition. Repeated structures are notoriously hard for establishing correspondences using multi-view geometry. Even more importantly, they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting evidence and significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, they form an...

10.1109/cvpr.2013.119 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

We address the problem of finding reliable dense correspondences between a pair of images. This is a challenging task due to strong appearance differences between the corresponding scene elements and ambiguities generated by repetitive patterns. The contributions of this work are threefold. First, inspired by the classic idea of disambiguating feature matches using semi-local constraints, we develop an end-to-end trainable convolutional neural network architecture that identifies sets of spatially consistent matches by analyzing...

10.48550/arxiv.1810.10510 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Visual localization enables autonomous vehicles to navigate in their surroundings and augmented reality applications to link virtual and real worlds. Practical visual localization approaches need to be robust to a wide variety of viewing conditions, including day-night changes, as well as weather and seasonal variations, while providing highly accurate six degree-of-freedom (6DOF) camera pose estimates. In this paper, we extend three publicly available datasets containing images captured under a wide variety of viewing conditions, but lacking camera pose information, with...

10.1109/tpami.2020.3032010 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-10-21

10.1023/a:1019869530073 article EN International Journal of Computer Vision 2002-01-01

This paper presents a method for fully automatic and robust estimation of two-view geometry, autocalibration, and 3D metric reconstruction from point correspondences in images taken by cameras with wide circular field of view. We focus on cameras which have more than 180 degrees field of view and for which the standard perspective camera model is not sufficient, e.g., the cameras equipped with circular fish-eye lenses Nikon FC-E8 (183 degrees), Sigma 8mm-f4-EX (180 degrees), or with curved conical mirrors. We assume a circular field of view and axially symmetric image projection to autocalibrate...

10.1109/tpami.2006.151 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2006-05-25

This paper presents a general solution to the determination of the pose of a perspective camera with unknown focal length from images of four 3D reference points. Our problem is a generalization of the P3P and P4P problems previously developed for fully calibrated cameras. Given four 2D-to-3D correspondences, we estimate camera position, orientation and recover the camera focal length. We formulate the problem and provide a minimal solution from four points by solving a system of algebraic equations. We compare the Hidden variable resultant and Gröbner basis techniques for solving the algebraic equations of our problem. By...

10.1109/cvpr.2008.4587793 article EN 2008 IEEE Conference on Computer Vision and Pattern Recognition 2008-06-01