NFDI4DS | UHH-SEMS - Publication Details

DUSt3R: Geometric 3D Vision Made Easy

OPENALEX - Publications

Shuzhe Wang Vincent Leroy Yohann Cabon Boris Chidlovskii Jérôme Revaud

10.1109/cvpr52733.2024.01956 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Hierarchical Scene Coordinate Classification and Regression for Visual Localization

Xiaotian Li Shuzhe Wang Yi Zhao Jakob Verbeek Juho Kannala

Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult...

10.1109/cvpr42600.2020.01200 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer

Shuzhe Wang Zakaria Laskar Iaroslav Melekhov Xiaotian Li Yi Zhao and 2 more

Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be...

10.1007/s11263-023-01982-9 article EN cc-by International Journal of Computer Vision 2024-02-06

Effects of metakaolin on sulfate and sulfuric acid resistance of grouting restoration materials

Xiaofei Wang Wenwen Wang Qiang Liu Shuzhe Wang Hongjie Luo and 2 more

10.1016/j.conbuildmat.2022.128714 article EN Construction and Building Materials 2022-08-11

Visual Localization via Few-Shot Scene Region Classification

Siyan Dong Shuzhe Wang Yixin Zhuang Juho Kannala Marc Pollefeys and 1 more

Visual (re)localization addresses the problem of estimating the 6-DoF (Degree of Freedom) camera pose of a query image captured in a known scene, which is a key building block of many computer vision and robotics applications. Recent advances in structure-based localization solve this problem by memorizing the mapping from pixels to scene coordinates with neural networks to build 2D-3D correspondences for optimization. However, such memorization requires training with amounts of posed images in each scene, which is heavy and inefficient. On the contrary,...

10.1109/3dv57658.2022.00051 article EN 2022 International Conference on 3D Vision (3DV) 2022-09-01

Continual Learning for Image-Based Camera Localization

Shuzhe Wang Zakaria Laskar Iaroslav Melekhov Xiaotian Li Juho Kannala

For several emerging technologies such as augmented reality, autonomous driving and robotics, visual localization is a critical component. Directly regressing camera pose/3D scene coordinates from the input image using deep neural networks has shown great potential. However, such methods assume a stationary data distribution with all scenes simultaneously available during training. In this paper, we approach the problem of visual localization in a continual learning setup – whereby the model is trained on scenes in an incremental manner....

10.1109/iccv48922.2021.00324 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

An automated CNN recommendation system for image classification tasks

Shuzhe Wang Sun Li Fan Wang Jun Sun Satoshi Naoi and 5 more

Nowadays, CNNs are widely used in practical applications for image classification tasks. However, the design of a CNN model is very professional work, which is difficult for ordinary users. Besides, even for experts in CNNs, selecting an optimal model for a specific task may still need a lot of time (to train many different models). In order to solve this problem, we proposed an automated CNN recommendation system for image classification tasks. Our system is able to evaluate the complexity of the task and the ability of the CNN model precisely. By using the evaluation results, the system can recommend the CNN model that matches the task perfectly. The recommendation process is fast...

10.1109/icme.2017.8019347 article EN 2017 IEEE International Conference on Multimedia and Expo (ICME) 2017-07-01

DGC-GNN: Descriptor-free Geometric-Color Graph Neural Network for 2D-3D Matching

Shuzhe Wang Juho Kannala Dániel Baráth

Matching 2D keypoints in an image to a sparse 3D point cloud of the scene without requiring visual descriptors has garnered increased interest due to its low memory requirements, inherent privacy preservation, and reduced need for expensive 3D model maintenance compared to visual descriptor-based methods. However, existing algorithms often compromise on performance, resulting in a significant deterioration compared to their descriptor-based counterparts. In this paper, we introduce DGC-GNN, a novel algorithm that employs a global-to-local Graph...

10.48550/arxiv.2306.12547 preprint EN cc-by arXiv (Cornell University) 2023-01-01

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang Juho Kannala Dániel Baráth

10.1109/cvpr52733.2024.01973 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Digging Into Self-Supervised Learning of Feature Descriptors

Iaroslav Melekhov Zakaria Laskar Xiaotian Li Shuzhe Wang Juho Kannala

Fully-supervised CNN-based approaches for learning local image descriptors have shown remarkable results in a wide range of geometric tasks. However, most of them require per-pixel ground-truth keypoint correspondence data, which is difficult to acquire at scale. To address this challenge, recent weakly- and self-supervised methods can learn feature descriptors from relative camera poses or using only synthetic rigid transformations such as homographies. In this work, we focus on understanding the limitations...

10.1109/3dv53792.2021.00122 article EN 2021 International Conference on 3D Vision (3DV) 2021-12-01

DUSt3R: Geometric 3D Vision Made Easy

Shuzhe Wang Vincent Leroy Yohann Cabon Boris Chidlovskii Jérôme Revaud

Multi-view stereo reconstruction (MVS) in the wild requires first estimating the camera parameters, e.g. intrinsic and extrinsic parameters. These are usually tedious and cumbersome to obtain, yet they are mandatory to triangulate corresponding pixels in 3D space, which is the core of all best performing MVS algorithms. In this work, we take an opposite stance and introduce DUSt3R, a radically novel paradigm for Dense and Unconstrained Stereo 3D Reconstruction of arbitrary image collections, i.e. operating without prior...

10.48550/arxiv.2312.14132 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Guiding Local Feature Matching with Surface Curvature

Shuzhe Wang Juho Kannala Marc Pollefeys Dániel Baráth

We propose a new method, called curvature similarity extractor (CSE), for improving local feature matching across images. CSE calculates the curvature of the local 3D surface patch for each detected feature point in a viewpoint-invariant manner via fitting quadrics to predicted monocular depth maps. This curvature is then leveraged as an additional signal in feature matching with off-the-shelf matchers like SuperGlue and LoFTR. Additionally, CSE enables end-to-end joint training by connecting the matcher and depth predictor networks. Our experiments demonstrate on...

10.1109/iccv51070.2023.01648 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Insights and Challenges in Correcting Force Field Based Solvation Free Energies Using A Neural Network Potential

Johannes Karwounopoulos Zhiyi Wu Sara Tkaczyk Shuzhe Wang Adam L. Baskerville and 5 more

We present a comprehensive study investigating the potential gain in accuracy for calculating absolute solvation free energies (ASFE) using a neural network potential to describe the intramolecular energy of the solute. We calculated the ASFE for most compounds from the FreeSolv database using the Open Force Field (OpenFF) and compared them to earlier results obtained with the CHARMM General Force Field (CGenFF). By applying a nonequilibrium (NEQ) switching approach between the molecular mechanics (MM) description (either OpenFF or CGenFF) and the neural net potential (NNP)/MM level...

10.26434/chemrxiv-2023-8jgjq-v2 preprint EN cc-by-nc-nd 2024-03-01

Differentiable Product Quantization for Memory Efficient Camera Relocalization

Zakaria Laskar Iaroslav Melekhov Assia Benbihi Shuzhe Wang Juho Kannala

Camera relocalization relies on 3D models of the scene with a large memory footprint that is incompatible with the memory budget of several applications. One solution to reduce the scene memory size is map compression by removing certain 3D points and descriptor quantization. This achieves high compression but leads to a performance drop due to information loss. To address the memory-performance trade-off, we train a light-weight scene-specific auto-encoder network that performs descriptor quantization-dequantization in an end-to-end differentiable manner updating both product quantization...

10.48550/arxiv.2407.15540 preprint EN arXiv (Cornell University) 2024-07-22

Gaussian Splatting in Mirrors: Reflection-Aware Rendering via Virtual Camera Optimization

Zihan Wang Shuzhe Wang Matias Turkulainen Jing Fang Juho Kannala

Recent advancements in 3D Gaussian Splatting (3D-GS) have revolutionized novel view synthesis, facilitating real-time, high-quality image rendering. However, in scenarios involving reflective surfaces, particularly mirrors, 3D-GS often misinterprets reflections as virtual spaces, resulting in blurred and inconsistent multi-view rendering within mirrors. Our paper presents a method aimed at obtaining consistent reflection rendering by modelling reflections as physically-based virtual cameras. We estimate mirror planes with depth...

10.48550/arxiv.2410.01614 preprint EN arXiv (Cornell University) 2024-10-02

DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Yihao Wang Marcus Klasson Matias Turkulainen Shuzhe Wang Juho Kannala and 1 more

Gaussian splatting enables fast novel view synthesis in static 3D environments. However, reconstructing real-world environments remains challenging as distractors or occluders break the multi-view consistency assumption required for accurate 3D reconstruction. Most existing methods rely on external semantic information from pre-trained models, introducing additional computational overhead as pre-processing steps or during optimization. In this work, we propose a method, DeSplat, that directly...

10.48550/arxiv.2411.19756 preprint EN arXiv (Cornell University) 2024-11-29

Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization

Siyan Dong Shuzhe Wang Shaohui Liu Lulu Cai Qingnan Fan and 2 more

Visual localization aims to determine the camera pose of a query image relative to a database of posed images. In recent years, deep neural networks that directly regress camera poses have gained popularity due to their fast inference capabilities. However, existing methods struggle to either generalize well to new scenes or provide accurate camera pose estimates. To address these issues, we present Reloc3r, a simple yet effective visual localization framework. It consists of an elegantly designed relative pose regression network, and a minimalist...

10.48550/arxiv.2412.08376 preprint EN arXiv (Cornell University) 2024-12-11

SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos

Yuzheng Liu Siyan Dong Shuzhe Wang Yingda Yin Yanchao Yang and 2 more

In this paper, we introduce SLAM3R, a novel and effective monocular RGB SLAM system for real-time and high-quality dense 3D reconstruction. SLAM3R provides an end-to-end solution by seamlessly integrating local 3D reconstruction and global coordinate registration through feed-forward neural networks. Given an input video, the system first converts it into overlapping clips using a sliding window mechanism. Unlike traditional pose optimization-based methods, SLAM3R directly regresses 3D pointmaps from RGB images in each...

10.48550/arxiv.2412.09401 preprint EN arXiv (Cornell University) 2024-12-12

HSCNet++: Hierarchical Scene Coordinate Classification and Regression for Visual Localization with Transformer

Shuzhe Wang Zakaria Laskar Iaroslav Melekhov Xiaotian Li Yi Zhao and 2 more

Visual localization is critical to many applications in computer vision and robotics. To address single-image RGB localization, state-of-the-art feature-based methods match local descriptors between a query image and a pre-built 3D model. Recently, deep neural networks have been exploited to regress the mapping between raw pixels and 3D coordinates in the scene, and thus the matching is implicitly performed by the forward pass through the network. However, in a large and ambiguous environment, learning such a regression task directly can be difficult...

10.48550/arxiv.2305.03595 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Prediction of key chemical parameters based on improved Transformer

Shuzhe Wang Huaiyu Sun Y. Wang Xing Luo

It is difficult to build accurate prediction models for complex chemical processes with significant nonlinearity and dynamics based on traditional shallow static models. A prediction model based on an improved Transformer, called CNN-Trans, is proposed to address the above challenges. In order to improve the efficiency of the model and enhance its local feature extraction capability, a CNN architecture is applied to the Transformer's architecture: (1) Dilated causal convolution is used in the embedding layer to obtain multi-scale information and a larger sensory...

10.1109/iccea58433.2023.10135471 article EN 2023-04-07

Effects of Metakaolin on Sulfate and Sulfuric Acid Resistance of NHL2-SAC-WER-Based Grouting Materials

Xiaofei Wang Wenwen Wang Qiang Liu Shuzhe Wang Hongjie Luo and 2 more

In this paper, the effect of metakaolin (MK) on the durability of NHL2-SAC-WER-based grouting materials in 5% sodium sulfate solution and sulfuric acid solution (pH=1) was investigated. A comprehensive study was carried out for the attacked samples by measurement of compressive strength, X-ray diffraction (XRD), scanning electron microscopy (SEM) etc. The results showed that the addition of MK could effectively enhance the resistance to sulfate and sulfuric acid. For samples immersed in sulfate solution, larger ettringite crystals were formed inside cross-supported with...

10.2139/ssrn.4116221 article EN SSRN Electronic Journal 2022-01-01

Visual Localization via Few-Shot Scene Region Classification

Siyan Dong Shuzhe Wang Yixin Zhuang Juho Kannala Marc Pollefeys and 1 more

Visual (re)localization addresses the problem of estimating the 6-DoF (Degree of Freedom) camera pose of a query image captured in a known scene, which is a key building block of many computer vision and robotics applications. Recent advances in structure-based localization solve this problem by memorizing the mapping from pixels to scene coordinates with neural networks to build 2D-3D correspondences for optimization. However, such memorization requires training with amounts of posed images in each scene, which is heavy and inefficient. On the contrary,...

10.48550/arxiv.2208.06933 preprint EN other-oa arXiv (Cornell University) 2022-01-01