- Advanced Vision and Imaging
- 3D Surveying and Cultural Heritage
- Robotics and Sensor-Based Localization
- Advanced Image and Video Retrieval Techniques
- Computer Graphics and Visualization Techniques
- Remote Sensing and LiDAR Applications
- 3D Shape Modeling and Analysis
- Optical measurement and interference techniques
- Indoor and Outdoor Localization Technologies
- Generative Adversarial Networks and Image Synthesis
- Visual Attention and Saliency Detection
- Advanced Neural Network Applications
- Speech and Audio Processing
- Image Enhancement Techniques
- Image Processing and 3D Reconstruction
- Video Surveillance and Tracking Methods
- Automated Road and Building Extraction
- Human Motion and Animation
- Image and Object Detection Techniques
- Advanced Numerical Analysis Techniques
- Manufacturing Process and Optimization
- Medical Image Segmentation Techniques
- Underwater Vehicles and Communication Systems
- Handwritten Text Recognition Techniques
- Museums and Cultural Heritage
Simon Fraser University
2017-2024
Autodesk (Canada)
2024
Google (United States)
2009-2021
Seattle University
2009-2021
Washington University in St. Louis
2014-2017
University of Washington
2009-2013
Stanford University
2013
Cornell University
2010
University of Illinois Urbana-Champaign
2003-2008
This paper proposes a novel algorithm for multiview stereopsis that outputs a dense set of small rectangular patches covering the surfaces visible in the images. Stereopsis is implemented as a match, expand, and filter procedure, starting from a sparse set of matched keypoints, and repeatedly expanding these before using visibility constraints to filter away false matches. The keys to the performance of the proposed algorithm are effective techniques for enforcing local photometric consistency and global visibility constraints. Simple but effective methods are also proposed to turn...
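As an illustration of what a local photometric consistency check can look like, the sketch below scores a patch by normalized cross-correlation (NCC) between its intensity samples in a reference image and in other images. The abstract does not name the measure used, so NCC, the function names, and the 0.7 threshold are all assumptions made for this example.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length intensity lists.

    Returns a value in [-1, 1]; 0.0 is returned for constant (textureless)
    input, where the score is undefined.
    """
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(da, db)) / denom


def is_photoconsistent(ref_samples, other_samples, threshold=0.7):
    """Hypothetical acceptance test: mean NCC against the reference view.

    `threshold` is an illustrative value, not one taken from the paper.
    """
    scores = [ncc(ref_samples, s) for s in other_samples]
    return sum(scores) / len(scores) >= threshold
```

NCC is invariant to affine changes in brightness, which is why it is a common choice when the same surface patch is seen under different exposures.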
We present a system that can reconstruct 3D geometry from large, unorganized collections of photographs such as those found by searching for a given city (e.g., Rome) on Internet photo-sharing sites. Our system is built on a set of new, distributed computer vision algorithms for image matching and 3D reconstruction, designed to maximize parallelism at each stage of the pipeline and to scale gracefully with both the size of the problem and the amount of available computation. Our experimental results demonstrate that it is now possible to reconstruct city-scale image collections with more than...
This paper introduces an approach for enabling existing multi-view stereo methods to operate on extremely large unstructured photo collections. The main idea is to decompose the collection into a set of overlapping sets of photos that can be processed in parallel, and to merge the resulting reconstructions. This overlapping clustering problem is formulated as a constrained optimization and solved iteratively. The merging algorithm, designed to be parallel and out-of-core, incorporates robust filtering steps to eliminate low-quality...
This paper proposes a novel algorithm for calibrated multi-view stereopsis that outputs a (quasi) dense set of rectangular patches covering the surfaces visible in the input images. This algorithm does not require any initialization in the form of a bounding volume, and it automatically detects and discards outliers and obstacles. It does not perform any smoothing across nearby features, yet is currently the top performer in terms of both coverage and accuracy for four of the six benchmark datasets presented in [20]. The keys to its performance are effective techniques...
Multi-view stereo (MVS) algorithms now produce reconstructions that rival laser range scanner accuracy. However, they require textured surfaces, and therefore work poorly for many architectural scenes (e.g., building interiors with textureless, painted walls). This paper presents a novel MVS approach to overcome these limitations for Manhattan World scenes, i.e., scenes that consist of piece-wise planar surfaces with dominant directions. Given a set of calibrated photographs, we first reconstruct regions using an...
This paper proposes a fully automated 3D reconstruction and visualization system for architectural scenes (interiors and exteriors). The reconstruction of indoor environments from photographs is particularly challenging due to texture-poor planar surfaces such as uniformly-painted walls. Our system first uses structure-from-motion, multi-view stereo, and a stereo algorithm specifically designed for Manhattan-world scenes (scenes consisting predominantly of piece-wise planar surfaces with dominant directions) to calibrate the cameras and to recover initial...
This paper proposes a deep neural architecture, PlaneRCNN, that detects and reconstructs piecewise planar regions from a single RGB image. PlaneRCNN employs a variant of Mask R-CNN to detect planes with their plane parameters and segmentation masks. PlaneRCNN then refines an arbitrary number of segmentation masks with a novel loss enforcing the consistency with a nearby view during training. The paper also presents a new benchmark with more fine-grained plane segmentations in the ground-truth, in which PlaneRCNN outperforms existing state-of-the-art methods with significant...
This paper proposes a deep neural network (DNN) for piece-wise planar depthmap reconstruction from a single RGB image. While DNNs have brought remarkable progress to single-image depth prediction, piece-wise planar depthmap reconstruction requires a structured geometry representation, and has been a difficult task to master even for DNNs. The proposed end-to-end DNN learns to directly infer a set of plane parameters and corresponding plane segmentation masks from a single RGB image. We have generated more than 50,000 piece-wise planar depthmaps for training and testing from ScanNet, a large-scale RGBD video database. Our...
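To make the link between plane parameters, segmentation masks, and a piece-wise planar depthmap concrete, the sketch below renders depth from planes under an assumed pinhole camera. The plane parameterization (normal `n` and offset `d` with `n . X = d`), the intrinsics names `fx, fy, cx, cy`, and the label-map encoding (`-1` for non-planar pixels) are conventions chosen for this example, not details taken from the paper.

```python
def plane_depth(u, v, fx, fy, cx, cy, normal, offset):
    """Depth z at pixel (u, v) of the plane n . X = offset.

    A 3D point on the pixel's viewing ray is X = z * ray, where
    ray = ((u - cx) / fx, (v - cy) / fy, 1), so z = offset / (n . ray).
    Returns None when the ray is parallel to the plane or the
    intersection lies behind the camera.
    """
    ray = ((u - cx) / fx, (v - cy) / fy, 1.0)
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        return None
    z = offset / denom
    return z if z > 0 else None


def compose_depthmap(labels, planes, fx, fy, cx, cy):
    """Build a depthmap from a per-pixel plane index map.

    labels[v][u] is an index into `planes` (a list of (normal, offset)
    pairs), or -1 for pixels not assigned to any plane (assumed encoding).
    """
    depth = []
    for v, row in enumerate(labels):
        out_row = []
        for u, idx in enumerate(row):
            if idx < 0:
                out_row.append(None)
            else:
                normal, offset = planes[idx]
                out_row.append(plane_depth(u, v, fx, fy, cx, cy, normal, offset))
        depth.append(out_row)
    return depth
```

A fronto-parallel plane (normal along the optical axis) yields constant depth equal to its offset, which is a quick sanity check for the formula.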
This paper addresses the problem of converting a rasterized floorplan image into a vector-graphics representation. Unlike existing approaches that rely on a sequence of low-level image processing heuristics, we adopt a learning-based approach. A neural architecture first transforms a rasterized image to a set of junctions that represent low-level geometric and semantic information (e.g., wall corners or door end-points). Integer programming is then formulated to aggregate junctions into a set of simple primitives (e.g., wall lines, door lines, or icon boxes) to produce a vectorized floorplan,...
This paper sets a new foundation for data-driven inertial navigation research, where the task is the estimation of horizontal positions and heading direction of a moving subject from a sequence of IMU sensor measurements from a phone. In contrast to existing methods, our method can handle varying phone orientations and placements. More concretely, the paper presents 1) a new benchmark containing more than 40 hours of IMU sensor data from 100 human subjects with ground-truth 3D trajectories under natural human motions; 2) novel neural architectures, making...
This paper presents a novel 3D modeling framework that reconstructs an indoor scene as a structured model from panorama RGBD images. A scene geometry is represented as a graph, where nodes correspond to structural elements such as rooms, walls, and objects. The approach devises a structure grammar that defines how a scene graph can be manipulated. The grammar then drives a principled new reconstruction algorithm, where the grammar rules are sequentially applied to recover a structured model. The paper also proposes a new room segmentation algorithm and an offset-map reconstruction algorithm that are used in the framework and can enforce...
This paper presents a system to reconstruct piecewise planar and compact floorplans from images, which are then converted to high quality texture-mapped models for free-viewpoint visualization. There are two main challenges in image-based floorplan reconstruction. The first is the lack of 3D information that can be extracted from images by Structure from Motion and Multi-View Stereo, as indoor scenes abound with non-diffuse and homogeneous surfaces plus clutter. The second challenge is the need for a sophisticated regularization...
We present the first large scale system for capturing and rendering relightable scene reconstructions from massive unstructured photo collections taken under different illumination conditions and viewpoints. We combine photos taken from many sources, Flickr-based ground-level imagery, oblique aerial views, and street view, to recover models that are significantly more complete and detailed than previously demonstrated. We demonstrate the ability to match both the viewpoint and illumination of arbitrary input photos, enabling a Visual Turing Test...
We address the problem of geo-registering ground-based multi-view stereo models by ground-to-aerial image matching. The main contribution is a fully automated geo-registration pipeline with a novel viewpoint-dependent matching method that handles ground to aerial viewpoint variation. We conduct large-scale experiments which consist of many popular outdoor landmarks in Rome. The proposed approach demonstrates a high success rate for the task, and dramatically outperforms state-of-the-art techniques, yielding...
We propose a new approach for 3D instance segmentation based on sparse convolution and point affinity prediction, which indicates the likelihood of two points belonging to the same instance. The proposed network, built upon submanifold sparse convolution [3], processes a voxelized point cloud and predicts semantic scores for each occupied voxel as well as the affinity between neighboring voxels at different scales. A simple yet effective clustering algorithm segments points into instances based on the predicted affinity and the mesh topology. The semantic label for each instance is determined by the semantic prediction....
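One simple way to turn pairwise affinities into instances is to merge every pair of neighbors whose affinity exceeds a threshold and take the connected components. The sketch below does this with a union-find structure. It is a generic stand-in rather than the paper's clustering algorithm (which also uses multiple scales and mesh topology), and the edge format `(i, j, affinity)` and the 0.5 threshold are assumptions for this example.

```python
def cluster_by_affinity(num_points, edges, threshold=0.5):
    """Group points into instances by thresholded pairwise affinity.

    edges: iterable of (i, j, affinity) for neighboring points, with
    affinity in [0, 1] (hypothetical format). Returns a list of instance
    ids, one per point, numbered from 0 in order of first appearance.
    """
    parent = list(range(num_points))

    def find(i):
        # Walk to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j, affinity in edges:
        if affinity >= threshold:
            root_i, root_j = find(i), find(j)
            if root_i != root_j:
                parent[root_j] = root_i

    instance_of_root = {}
    labels = []
    for i in range(num_points):
        root = find(i)
        labels.append(instance_of_root.setdefault(root, len(instance_of_root)))
    return labels
```

For example, with strong affinities along 0-1-2 and 3-4 and a weak one between 2 and 3, the five points fall into two instances.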
This paper proposes a new approach for automated floorplan reconstruction from RGBD scans, a major milestone in indoor mapping research. The approach, dubbed Floor-SP, formulates a novel optimization problem, where room-wise coordinate descent sequentially solves shortest path problems to optimize the floorplan graph structure. The objective function consists of data terms guided by deep neural networks, consistency terms encouraging adjacent rooms to share corners and walls, and the model complexity term. The approach does not require...
This paper presents a scene agnostic neural architecture for camera localization, where model parameters and scenes are independent from each other. Despite recent advancement in learning based methods, most approaches require training for each scene one by one, and are not applicable to online applications such as SLAM and robotic navigation, where a model must be built on-the-fly. Our approach learns to build a hierarchical scene representation and predicts a dense scene coordinate map of a query RGB image on-the-fly given an arbitrary scene. The 6D pose...
This paper proposes a generative adversarial layout refinement network for automated floorplan generation. Our architecture is an integration of a graph-constrained relational GAN and a conditional GAN, where a previously generated layout becomes the next input constraint, enabling iterative refinement. A surprising discovery of our research is that a simple non-iterative training process, dubbed component-wise GT-conditioning, is effective in learning such a generator. The generator further allows us to improve...