NFDI4DS | UHH-SEMS - Publication Details

Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference

OPENALEX - Publications

Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, and 1 more

Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth...

10.1109/cvpr.2019.00567 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Learning Two-View Correspondences and Geometry Using Order-Aware Network

Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Lei Zhou, and 4 more

Establishing correspondences between two images requires both local and global spatial context. Given putative correspondences of feature points in two views, in this paper, we propose Order-Aware Network, which infers the probabilities of correspondences being inliers and regresses the relative pose encoded by the essential matrix. Specifically, the proposed network is built hierarchically and comprises three novel operations. First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment matrix. These clusters are in a canonical...

10.1109/iccv.2019.00594 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

ContextDesc: Local Descriptor Augmentation With Cross-Modality Context

Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, and 3 more

Most existing studies on learning local features focus on the patch-based descriptions of individual keypoints, while neglecting the spatial relations established from their keypoint locations. In this paper, we go beyond the local detail representation by introducing context awareness to augment off-the-shelf local feature descriptors. Specifically, we propose a unified learning framework that leverages and aggregates the cross-modality contextual information, including (i) visual context from high-level image representation, and (ii)...

10.1109/cvpr.2019.00263 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Very Large-Scale Global SfM by Distributed Motion Averaging

Siyu Zhu, Runze Zhang, Lei Zhou, Tianwei Shen, Tian Fang, and 2 more

Global Structure-from-Motion (SfM) techniques have demonstrated superior efficiency and accuracy compared with the conventional incremental approach in many recent studies. This work proposes a divide-and-conquer framework to solve very large global SfM at the scale of millions of images. Specifically, we first divide all images into multiple partitions that preserve strong data association for well-posed and parallel local motion averaging. Then, we solve a global motion averaging that determines cameras at partition boundaries and a similarity...

10.1109/cvpr.2018.00480 article EN 2018-06-01

Joint Semantic Segmentation and Boundary Detection Using Iterative Pyramid Contexts

Mingmin Zhen, Jinglu Wang, Lei Zhou, Shiwei Li, Tianwei Shen, and 3 more

In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection. The critical component in the framework is the iterative pyramid context module (PCM), which couples the two tasks and stores the shared latent semantics to interact between the two tasks. For semantic boundary detection, we propose a novel spatial gradient fusion to suppress non-semantic edges. As semantic boundary detection is the dual task of semantic segmentation, we introduce a loss function with a boundary consistency constraint to improve the boundary pixel accuracy for semantic segmentation. Our extensive...

10.1109/cvpr42600.2020.01368 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Beyond Photometric Loss for Self-Supervised Ego-Motion Estimation

Tianwei Shen, Zixin Luo, Lei Zhou, Hanyu Deng, Runze Zhang, and 2 more

Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM). Recently, the self-supervised learning framework that jointly optimizes the relative pose and target image depth has attracted the attention of the community. Previous works rely on the photometric error generated from depths and poses between adjacent frames, which contains large systematic error under realistic scenes due to reflective surfaces and occlusions. In this paper, we bridge the gap between geometric loss and photometric loss by introducing...

10.1109/icra.2019.8793479 preprint EN 2019 International Conference on Robotics and Automation (ICRA) 2019-05-01

KFNet: Learning Temporal Camera Relocalization Using Kalman Filtering

Lei Zhou, Zixin Luo, Tianwei Shen, Jiahui Zhang, Mingmin Zhen, and 3 more

Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization, which focuses on a still image. Even though the time dependency has been taken into account, current temporal relocalization methods generally underperform the state-of-the-art one-shot approaches in terms of accuracy. In this work, we improve the temporal relocalization method by using a network architecture that incorporates Kalman filtering (KFNet) for online camera relocalization. In particular, KFNet extends the scene coordinate regression problem...

10.1109/cvpr42600.2020.00497 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Progressive Large Scale-Invariant Image Matching in Scale Space

Lei Zhou, Siyu Zhu, Tianwei Shen, Jinglu Wang, Tian Fang, and 1 more

The power of modern image matching approaches is still fundamentally limited by the abrupt scale changes in images. In this paper, we propose a scale-invariant image matching approach to tackling the very large scale variation of views. Drawing inspiration from scale space theory, we start with encoding the image's scale space into a compact multi-scale representation. Then, rather than trying to find the exact feature matches all in one step, we propose a progressive two-stage approach. First, we determine the related scale levels in scale space, enclosing the inlier feature correspondences, based on...

10.1109/iccv.2017.259 article EN 2017-10-01

Parallel Structure from Motion from Local Increment to Global Averaging

Siyu Zhu, Tianwei Shen, Lei Zhou, Runze Zhang, Jinglu Wang, and 2 more

In this paper, we tackle the accurate and consistent Structure from Motion (SfM) problem, in particular camera registration, far exceeding the memory of a single computer in parallel. Different from the previous methods, which drastically simplify the parameters of SfM and sacrifice the accuracy of the final reconstruction, we try to preserve the connectivities among cameras by proposing a camera clustering algorithm to divide a large SfM problem into smaller sub-problems in terms of camera clusters with overlapping. We then exploit a hybrid formulation that applies...

10.48550/arxiv.1702.08601 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Learning Two-View Correspondences and Geometry Using Order-Aware Network

Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Lei Zhou, and 4 more

Establishing correspondences between two images requires both local and global spatial context. Given putative correspondences of feature points in two views, in this paper, we propose Order-Aware Network, which infers the probabilities of correspondences being inliers and regresses the relative pose encoded by the essential matrix. Specifically, the proposed network is built hierarchically and comprises three novel operations. First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment matrix. These clusters are in a canonical...

10.48550/arxiv.1908.04964 preprint EN other-oa arXiv (Cornell University) 2019-01-01

NinjaDesc: Content-Concealing Visual Descriptors via Adversarial Learning

Tony Ng, Hyo Jin Kim, Vincent T. Lee, Daniel DeTone, Tsun-Yi Yang, and 5 more

In the light of recent analyses on privacy-concerning scene revelation from visual descriptors, we develop descriptors that conceal the input image content. In particular, we propose an adversarial learning framework for training visual descriptors that prevent image reconstruction, while maintaining the matching accuracy. We let a feature encoding network and an image reconstruction network compete with each other, such that the feature encoder tries to impede the image reconstruction with its generated descriptors, while the reconstructor tries to recover the input image from the descriptors. The experimental results demonstrate that the visual descriptors obtained with our method...

10.1109/cvpr52688.2022.01246 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

OANet: Learning Two-View Correspondences and Geometry Using Order-Aware Network

Jiahui Zhang, Dawei Sun, Zixin Luo, Anbang Yao, Hongkai Chen, and 5 more

Establishing correct correspondences between two images should consider both local and global spatial context. Given putative correspondences of feature points in two views, in this paper, we propose Order-Aware Network, which infers the probabilities of correspondences being inliers and regresses the relative pose encoded by the essential or fundamental matrix. Specifically, the proposed network is built hierarchically and comprises three operations. First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment...

10.1109/tpami.2020.3048013 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-12-29

Distributed Very Large Scale Bundle Adjustment by Global Camera Consensus

Runze Zhang, Siyu Zhu, Tianwei Shen, Lei Zhou, Zixin Luo, and 2 more

The increasing scale of Structure-from-Motion is fundamentally limited by the conventional optimization framework for the all-in-one global bundle adjustment. In this paper, we propose a distributed approach to coping with this global bundle adjustment for very large scale Structure-from-Motion computation. First, we derive the distributed formulation from the classical optimization algorithm ADMM, the Alternating Direction Method of Multipliers, based on the global camera consensus. Then, we analyze the conditions under which the convergence of this distributed optimization would be guaranteed. In particular, we adopt over-relaxation and...

10.1109/tpami.2018.2840719 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-05-25

Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency

Tianwei Shen, Lei Zhou, Zixin Luo, Yao Yao, Shiwei Li, and 3 more

The self-supervised learning of depth and pose from monocular sequences provides an attractive solution by using the photometric consistency of nearby frames, as it depends much less on ground-truth data. In this paper, we address the issue that arises when previous assumptions of the self-supervised approaches are violated due to the dynamic nature of real-world scenes. Different from handling the noise as uncertainty, our key idea is to incorporate more robust geometric quantities and enforce internal consistency in the temporal image sequence. As demonstrated on commonly...

10.1109/iccvw.2019.00499 preprint EN 2019-10-01

Recurrent MVSNet for High-resolution Multi-view Stereo Depth Inference

Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, and 1 more

Deep learning has recently demonstrated its excellent performance for multi-view stereo (MVS). However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes. In this paper, we introduce a scalable multi-view stereo framework based on the recurrent neural network. Instead of regularizing the entire 3D cost volume in one go, the proposed Recurrent Multi-view Stereo Network (R-MVSNet) sequentially regularizes the 2D cost maps along the depth...

10.48550/arxiv.1902.10556 preprint EN other-oa arXiv (Cornell University) 2019-01-01

ContextDesc: Local Descriptor Augmentation with Cross-Modality Context

Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, and 3 more

Most existing studies on learning local features focus on the patch-based descriptions of individual keypoints, while neglecting the spatial relations established from their keypoint locations. In this paper, we go beyond the local detail representation by introducing context awareness to augment off-the-shelf local feature descriptors. Specifically, we propose a unified learning framework that leverages and aggregates the cross-modality contextual information, including (i) visual context from high-level image representation, and (ii)...

10.48550/arxiv.1904.04084 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Cross-Atlas Convolution for Parameterization Invariant Learning on Textured Mesh Surface

Shiwei Li, Zixin Luo, Mingmin Zhen, Yao Yao, Tianwei Shen, and 2 more

We present a convolutional network architecture for direct feature learning on mesh surfaces through their atlases of texture maps. The texture map encodes the parameterization from the 3D to the 2D domain, rendering not only RGB values but also rasterized geometric features if necessary. Since the parameterization of the texture map is not pre-determined and depends on the surface topologies, we introduce a novel cross-atlas convolution to recover the original mesh geodesic neighborhood, so as to achieve the invariance property to arbitrary parameterization. The proposed...

10.1109/cvpr.2019.00630 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Joint Semantic Segmentation and Boundary Detection using Iterative Pyramid Contexts

Mingmin Zhen, Jinglu Wang, Lei Zhou, Shiwei Li, Tianwei Shen, and 3 more

In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection. The critical component in the framework is the iterative pyramid context module (PCM), which couples the two tasks and stores the shared latent semantics to interact between the two tasks. For semantic boundary detection, we propose a novel spatial gradient fusion to suppress non-semantic edges. As semantic boundary detection is the dual task of semantic segmentation, we introduce a loss function with a boundary consistency constraint to improve the boundary pixel accuracy for semantic segmentation. Our extensive...

10.48550/arxiv.2004.07684 preprint EN other-oa arXiv (Cornell University) 2020-01-01