Sinisa Stekovic

ORCID: 0000-0003-0976-494X
Research Areas
  • Advanced Vision and Imaging
  • 3D Shape Modeling and Analysis
  • 3D Surveying and Cultural Heritage
  • Remote Sensing and LiDAR Applications
  • Computer Graphics and Visualization Techniques
  • Advanced Neural Network Applications
  • Robotics and Sensor-Based Localization
  • Human Pose and Action Recognition
  • Image Processing and 3D Reconstruction
  • Machine Learning and Data Classification
  • Multimodal Machine Learning Applications
  • Domain Adaptation and Few-Shot Learning
  • Scientific Computing and Data Management
  • Advanced Image and Video Retrieval Techniques

Graz University of Technology
2019-2024

We propose a novel method for reconstructing floor plans from noisy 3D point clouds. Our main contribution is a principled approach that relies on the Monte Carlo Tree Search (MCTS) algorithm to maximize a suitable objective function efficiently despite the complexity of the problem. Like previous work, we first project the input point cloud to a top view to create a density map and extract room proposals from it. Our method selects and optimizes the polygonal shapes of these room proposals jointly to fit the density map and outputs an accurate vectorized floor map even for large complex scenes. To...

10.1109/iccv48922.2021.01573 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

We explore how a general AI algorithm can be used for 3D scene understanding to reduce the need for training data. More exactly, we propose a modification of the Monte Carlo Tree Search (MCTS) algorithm to retrieve objects and room layouts from noisy RGB-D scans. While MCTS was developed as a game-playing algorithm, we show it can also be used for complex perception problems. Our adapted MCTS algorithm has few easy-to-tune hyperparameters and can optimise general losses. We use it to optimise the posterior probability of objects and room layout hypotheses given the RGB-D data. This results in an analysis-by-synthesis...

10.1109/cvpr46437.2021.01359 preprint EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

We present an automatic method for annotating images of indoor scenes with the CAD models of the objects by relying on RGB-D scans. Through a visual evaluation by 3D experts, we show that our method retrieves annotations that are at least as accurate as manual annotations, and can thus be used as ground truth without the burden of manually annotating 3D data. We do this using an analysis-by-synthesis approach, which compares renderings of the CAD models with the captured scene. We introduce a 'cloning procedure' that identifies objects that have the same geometry, to annotate these objects with the same CAD models. This...

10.1109/wacv56688.2023.00317 article EN 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023-01-01

We present an automated and efficient approach for retrieving high-quality CAD models of objects and their poses in a scene captured by a moving RGB-D camera. We first investigate various objective functions to measure similarity between a candidate CAD object model and the available data, and the best objective function appears to be a "render-and-compare" method comparing depth and mask rendering. We thus introduce a fast-search method that approximates an exhaustive search based on this objective function for simultaneously retrieving the object category, a CAD model, and the pose of an object given an approximate 3D...

10.1109/3dv62453.2024.00066 article EN 2024 International Conference on 3D Vision (3DV) 2024-03-18

We propose a simple yet effective method to learn to segment new indoor scenes from video frames: State-of-the-art methods trained on one dataset, even as large as the SUNRGB-D dataset, can perform poorly when applied to images that are not part of the dataset, because of the dataset bias, a common phenomenon in computer vision. To make semantic segmentation more useful in practice, one can exploit geometric constraints. Our main contribution is to show that these constraints can be cast conveniently as semi-supervised terms, which enforce the fact that the same class...

10.1109/wacv45572.2020.9093571 article EN 2020-03-01

We propose a novel method applicable in many scene understanding problems that adapts the Monte Carlo Tree Search (MCTS) algorithm, originally designed to learn to play games of high-state complexity. From a generated pool of proposals, our method jointly selects and optimizes proposals that minimize the objective term. In our first application for floor plan reconstruction from point clouds, our method selects and refines the room proposals, modelled as 2D polygons, by optimizing on an objective function combining the fitness as predicted by a deep network and regularizing terms...

10.1109/tpami.2022.3203729 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2022-09-26

We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects from images using interpretable shape programs. In comparison to traditional CAD model retrieval methods, the use of shape programs for 3D reconstruction allows for reasoning about the semantic properties of reconstructed objects, editing, low memory footprint, etc. However, the utilization of shape programs for 3D scene understanding has been largely neglected in past works. As our main contribution, we enable gradient-based optimization by introducing a module that...

10.48550/arxiv.2404.10620 preprint EN arXiv (Cornell University) 2024-04-16

We show that it is possible to learn semantic segmentation from very limited amounts of manual annotations, by enforcing geometric 3D constraints between multiple views. More exactly, image locations corresponding to the same physical point should all have the same label. We show that introducing such constraints during learning is effective, even when no label is available for a point, and can be done simply by employing techniques from 'general' semi-supervised learning in the context of semantic segmentation. To demonstrate this idea, we use RGB-D sequences of rigid...

10.48550/arxiv.1812.10717 preprint EN other-oa arXiv (Cornell University) 2018-01-01

We present an automated and efficient approach for retrieving high-quality CAD models of objects and their poses in a scene captured by a moving RGB-D camera. We first investigate various objective functions to measure similarity between a candidate CAD object model and the available data, and the best objective function appears to be a "render-and-compare" method comparing depth and mask rendering. We thus introduce a fast-search method that approximates an exhaustive search based on this objective function for simultaneously retrieving the object category, a CAD model, and the pose of an object given an approximate 3D...

10.48550/arxiv.2309.06107 preprint EN other-oa arXiv (Cornell University) 2023-01-01

We propose a simple yet effective method to learn to segment new indoor scenes from video frames: State-of-the-art methods trained on one dataset, even as large as the SUNRGB-D dataset, can perform poorly when applied to images that are not part of the dataset, because of the dataset bias, a common phenomenon in computer vision. To make semantic segmentation more useful in practice, one can exploit geometric constraints. Our main contribution is to show that these constraints can be cast conveniently as semi-supervised terms, which enforce the fact that the same class...

10.48550/arxiv.1904.12534 preprint EN other-oa arXiv (Cornell University) 2019-01-01

We present a novel method to reconstruct the 3D layout of a room (walls, floors, ceilings) from a single perspective view in challenging conditions, by contrast with previous single-view methods restricted to cuboid-shaped layouts. This input view can consist of a color image only, but considering a depth map results in a more accurate reconstruction. Our approach is formalized as solving a constrained discrete optimization problem to find the set of 3D polygons that constitute the layout. In order to deal with occlusions between components...

10.48550/arxiv.2001.02149 preprint EN other-oa arXiv (Cornell University) 2020-01-01

We present MonteBoxFinder, a method that, given a noisy input point cloud, fits cuboids to the input scene. Our primary contribution is a discrete optimization algorithm that, from a dense set of initially detected cuboids, is able to efficiently filter good boxes from the noisy ones. Inspired by recent applications of MCTS to scene understanding problems, we develop a stochastic algorithm that is, by design, more efficient for our task. Indeed, the quality of a fit of a cuboid arrangement is invariant to the order in which the cuboids are added into the scene. We develop several search baselines for our problem...

10.48550/arxiv.2207.14268 preprint EN other-oa arXiv (Cornell University) 2022-01-01

We present an automatic method for annotating images of indoor scenes with the CAD models of the objects by relying on RGB-D scans. Through a visual evaluation by 3D experts, we show that our method retrieves annotations that are at least as accurate as manual annotations, and can thus be used as ground truth without the burden of manually annotating 3D data. We do this using an analysis-by-synthesis approach, which compares renderings of the CAD models with the captured scene. We introduce a 'cloning procedure' that identifies objects that have the same geometry, to annotate these objects with the same CAD models. This...

10.48550/arxiv.2212.11796 preprint EN other-oa arXiv (Cornell University) 2022-01-01

We propose a novel method applicable in many scene understanding problems that adapts the Monte Carlo Tree Search (MCTS) algorithm, originally designed to learn to play games of high-state complexity. From a generated pool of proposals, our method jointly selects and optimizes proposals that minimize the objective term. In our first application for floor plan reconstruction from point clouds, our method selects and refines the room proposals, modelled as 2D polygons, by optimizing on an objective function combining the fitness as predicted by a deep network and regularizing terms...

10.48550/arxiv.2207.03204 preprint EN other-oa arXiv (Cornell University) 2022-01-01

We propose a novel method for reconstructing floor plans from noisy 3D point clouds. Our main contribution is a principled approach that relies on the Monte Carlo Tree Search (MCTS) algorithm to maximize a suitable objective function efficiently despite the complexity of the problem. Like previous work, we first project the input point cloud to a top view to create a density map and extract room proposals from it. Our method selects and optimizes the polygonal shapes of these room proposals jointly to fit the density map and outputs an accurate vectorized floor map even for large complex scenes. To...

10.48550/arxiv.2103.11161 preprint EN other-oa arXiv (Cornell University) 2021-01-01