Lukáš Neumann

ORCID: 0000-0002-9428-3712
Research Areas
  • Handwritten Text Recognition Techniques
  • Image Retrieval and Classification Techniques
  • Video Surveillance and Tracking Methods
  • Advanced Vision and Imaging
  • Autonomous Vehicle Technology and Safety
  • Advanced Neural Network Applications
  • Advanced Image and Video Retrieval Techniques
  • Vehicle License Plate Recognition
  • Human Pose and Action Recognition
  • Image Processing and 3D Reconstruction
  • Robotics and Sensor-Based Localization
  • Natural Language Processing Techniques
  • Anomaly Detection Techniques and Applications
  • Optical Measurement and Interference Techniques
  • Astronomy and Astrophysical Research
  • Image Processing Techniques and Applications
  • Food Supply Chain Traceability
  • Vehicle Dynamics and Control Systems
  • Luminescence Properties of Advanced Materials
  • Identification and Quantification in Food
  • Mathematics, Computing, and Information Processing
  • Neural Networks and Applications
  • Nonlinear Optical Materials Research
  • Advanced Chemical Physics Studies
  • Video Analysis and Summarization

Czech Technical University in Prague
2011-2024

University of Oxford
2019-2021

Oxford Research Group
2021

Technische Universität Berlin
2017-2019

Results of the ICDAR 2015 Robust Reading Competition are presented. A new Challenge 4 on Incidental Scene Text has been added to the Challenges on Born-Digital Images, Focused Images and Video Text. Challenge 4 is run on a newly acquired dataset of 1,670 images, evaluating Text Localisation, Word Recognition and End-to-End pipelines. In addition, the dataset for Challenge 3 has been substantially updated with more video sequences and more accurate ground truth data. Finally, tasks assessing End-to-End system performance have been introduced to all Challenges. The competition took...

10.1109/icdar.2015.7333942 article EN 2015-08-01

An end-to-end real-time scene text localization and recognition method is presented. The real-time performance is achieved by posing the character detection problem as an efficient sequential selection from the set of Extremal Regions (ERs). The ER detector is robust to blur, illumination, color and texture variation, and handles low-contrast text. In the first classification stage, the probability of each ER being a character is estimated using novel features calculated with O(1) complexity per region tested. Only ERs with locally maximal probability are selected...

10.1109/cvpr.2012.6248097 article EN 2012 IEEE Conference on Computer Vision and Pattern Recognition 2012-06-01

A method for scene text localization and recognition is proposed. The novelties include: training of both text detection and recognition in a single end-to-end pass, and the structure of the recognition CNN and the geometry of its input layer, which preserves the aspect of the text and adapts its resolution to the data. The proposed method achieves state-of-the-art accuracy on two standard datasets, ICDAR 2013 and ICDAR 2015, whilst being an order of magnitude faster than competing methods: the whole pipeline runs at 10 frames per second on an NVidia K80 GPU.

10.1109/iccv.2017.242 article EN 2017-10-01

This paper describes the COCO-Text dataset. In recent years large-scale datasets like SUN and Imagenet drove the advancement of scene understanding and object recognition. The goal of COCO-Text is to advance the state-of-the-art in text detection and recognition in natural images. The dataset is based on the MS COCO dataset, which contains images of complex everyday scenes. The images were not collected with text in mind and thus contain a broad variety of text instances. To reflect the diversity of text in natural scenes, we annotate text with (a) location in terms of a bounding box, (b) fine-grained...

10.48550/arxiv.1601.07140 preprint EN other-oa arXiv (Cornell University) 2016-01-01

An unconstrained end-to-end text localization and recognition method is presented. The method introduces a novel approach for character detection and recognition which combines the advantages of sliding-window and connected component methods. Characters are detected and recognized as image regions which contain strokes of specific orientations in a specific relative position, where the strokes are efficiently detected by convolving the image gradient field with a set of oriented bar filters. Additionally, a novel character representation efficiently calculated from the values obtained in the stroke detection phase is introduced....

10.1109/iccv.2013.19 article EN 2013-12-01

An end-to-end real-time text localization and recognition method is presented. Its real-time performance is achieved by posing the character detection and segmentation problem as an efficient sequential selection from the set of Extremal Regions. The ER detector is robust against blur, low contrast and illumination, color and texture variation. In the first stage, the probability of each ER being a character is estimated using features calculated by a novel algorithm in constant time, and only ERs with locally maximal probability are selected for the second stage, where...

10.1109/tpami.2015.2496234 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2015-10-30

An efficient method for text localization and recognition in real-world images is proposed. Thanks to effective pruning, it is able to exhaustively search the space of all character sequences in real time (200 ms on a 640x480 image). The method exploits higher-order properties of text such as word text lines. We demonstrate that the grouping stage plays a key role in the text localization performance and that a robust and precise grouping stage is able to compensate for errors of the character detector. The method includes a novel selector of Maximally Stable Extremal Regions (MSER) which exploits region topology. Experimental...

10.1109/icdar.2011.144 article EN International Conference on Document Analysis and Recognition 2011-09-01

We propose a novel easy-to-implement stroke detector based on an efficient pixel intensity comparison to surrounding pixels. Stroke-specific keypoints are efficiently detected and text fragments are subsequently extracted by local thresholding guided by keypoint properties. Classification based on effectively calculated features then eliminates non-text regions. The stroke-specific keypoints produce 2 times fewer region segmentations and still detect 25% more characters than the commonly exploited MSER detector, and the process is 4...

10.1109/iccv.2015.143 article EN 2015-12-01

An end-to-end real-time scene text localization and recognition method is presented. The three main novel features are: (i) keeping multiple segmentations of each character until the very last stage of the processing, when the context of each character in a text line is known, (ii) an efficient algorithm for selection of character segmentations minimizing a global criterion, and (iii) showing that, despite using theoretically scale-invariant methods, operating on a coarse Gaussian scale space pyramid yields improved results as many typographical artifacts are...

10.1109/icdar.2013.110 article EN 2013-08-01

An unconstrained end-to-end text localization and recognition method is presented. The method detects initial text hypotheses in a single pass by an efficient region-based method and subsequently refines the text hypotheses using a more robust local text model, which deviates from the common assumption of region-based methods that all characters are detected as connected components.

10.1109/icdar.2015.7333861 article EN 2015-08-01

Detecting the three-dimensional position and orientation of objects using a single RGB camera is a foundational task in computer vision with many important applications. Traditionally, 3D object detection methods are trained in a fully-supervised setup, requiring vast amounts of human annotations, which are laborious, costly, and do not scale well with the ever-increasing amounts of data being captured. In this paper, we present the first method to train 3D object detectors for monocular RGB cameras without domain-specific human annotations, thus making orders...

10.48550/arxiv.2501.09481 preprint EN arXiv (Cornell University) 2025-01-16

Deep learning has revolutionized computer vision, but it achieved its tremendous success using deep network architectures which are mostly hand-crafted and therefore likely suboptimal. Neural Architecture Search (NAS) aims to bridge this gap by following a well-defined optimization paradigm which systematically looks for the best architecture, given an objective criterion such as maximal classification accuracy. The main limitation of NAS is, however, its astronomical computational cost, as it typically requires...

10.48550/arxiv.2502.04975 preprint EN arXiv (Cornell University) 2025-02-07

Predicting future pedestrian trajectory is a crucial component of autonomous driving systems, as recognizing critical situations based only on current pedestrian position may come too late for any meaningful corrective action (e.g. braking) to take place. In this paper, we propose a new method to predict the future position of pedestrians with respect to a predicted future position of the ego-vehicle, thus giving the assistive/autonomous driving system sufficient time to respond. The method explicitly disentangles the actual movement of pedestrians in the real world from the ego-motion...

10.1109/cvpr46437.2021.01007 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

We consider the problem of future event prediction in video: if and when a future event will occur. To this end, we propose a number of representations and loss functions tailored to this problem. These include several probabilistic formulations that also model the uncertainty of the prediction. We train and evaluate the approach on two entirely different prediction scenarios: when a car will stop in the BDD100k driving dataset; and when a player is going to shoot a basketball towards the basket in the NCAA basketball dataset. We show that (i) we are able to predict events far in the future, up to 10 seconds before they occur;...

10.1109/cvprw.2019.00354 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2019-06-01

Recent advances in self-supervised learning have demonstrated that it is possible to learn accurate monocular depth reconstruction from raw video data, without using any 3D ground truth for supervision. However, in robotics applications, multiple views of a scene may or may not be available, depending on the actions of the robot, switching between monocular and multi-view reconstruction. To address this mixed setting, we proposed a new approach that extends any off-the-shelf self-supervised monocular depth reconstruction system to use more...

10.48550/arxiv.2004.05821 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We present a system for automatically converting 2D mask object predictions and raw LiDAR point clouds into full 3D bounding boxes of objects. Because the LiDAR point clouds are partial, directly fitting bounding boxes to the point clouds is meaningless. Instead, we suggest that obtaining good results requires sharing information between all objects in the dataset jointly, over multiple frames. We then make three improvements to the baseline. First, we address ambiguities in predicting the object rotations via direct optimization in this space while still backpropagating...

10.1109/icra46639.2022.9811693 article EN 2022 International Conference on Robotics and Automation (ICRA) 2022-05-23

The axial-injection end-burning hybrid rocket proposed twenty years ago by the authors recently recaptured the attention of researchers for its virtues, such as no $\zeta$ (oxidizer to fuel mass ratio) shift during firing and good throttling characteristics. This paper is the first report verifying these virtues using a laboratory scale motor. There are several requirements for realizing this type of hybrid rocket: 1) high fuel filling rate for obtaining an optimal $\zeta$; 2) small port intervals...

10.12989/aas.2017.4.3.281 article EN Advances in aircraft and spacecraft science 2017-05-23

Deep Learning-based approaches for 3D object detection and 6D pose estimation typically require large amounts of labeled training data. Labeling image data is expensive, and the 6D pose information is particularly difficult to obtain, as it requires a complex setup during image acquisition. Training with synthetic data is therefore very attractive. Large amounts of synthetic, labeled data can be generated, but it is not yet fully understood how certain aspects of the data generation affect the detection and pose estimation performance. Our work focuses on creating synthetic training data and investigating the effects on detection performance. We...

10.1109/etfa.2019.8869318 article EN 2019-09-01