Wanshui Gan

ORCID: 0000-0002-6720-6500
Research Areas
  • Advanced Vision and Imaging
  • Video Surveillance and Tracking Methods
  • Robotics and Sensor-Based Localization
  • Computer Graphics and Visualization Techniques
  • Robot Manipulation and Learning
  • Advanced Image and Video Retrieval Techniques
  • Advanced Image Processing Techniques
  • Manufacturing Process and Optimization
  • 3D Shape Modeling and Analysis
  • Gait Recognition and Analysis
  • Advanced Optical Sensing Technologies
  • Autonomous Vehicle Technology and Safety
  • Retinal Imaging and Analysis
  • Automotive and Human Injury Biomechanics
  • Image Processing Techniques and Applications
  • Image Enhancement Techniques
  • Image and Object Detection Techniques

The University of Tokyo
2022-2024

RIKEN Center for Advanced Intelligence Project
2023-2024

Shenzhen Institutes of Advanced Technology
2022-2023

University of Macau
2021-2023

Chinese Academy of Sciences
2022-2023

Neural radiance fields have made a remarkable breakthrough in the novel view synthesis task for 3D static scenes. However, in the 4D setting (e.g., dynamic scenes), the performance of existing methods is still limited by the capacity of the neural network, typically a multilayer perceptron (MLP). In this article, we utilize 3D voxels to model the 4D neural radiance field, abbreviated as V4D, where the voxels have two formats. The first one regularly models the 3D space and then uses the sampled local feature with the time index to model the density field and the texture field with a tiny MLP....

10.1109/tvcg.2023.3312127 article EN IEEE Transactions on Visualization and Computer Graphics 2023-09-05
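As a rough illustration of the voxel-plus-tiny-MLP idea sketched in this abstract, the PyTorch snippet below samples a learnable 3D feature grid at query points and feeds the local feature together with a time index to small density and color heads. The grid size, feature width, and module names are assumptions made for illustration, not the configuration used in V4D.

# Minimal sketch of a voxel grid queried by a tiny MLP for density and color.
# All sizes and names are illustrative assumptions, not the paper's settings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VoxelField4D(nn.Module):
    def __init__(self, grid_size=64, feat_dim=16, hidden=64):
        super().__init__()
        # Learnable 3D feature volume: (1, C, D, H, W)
        self.voxel = nn.Parameter(
            torch.zeros(1, feat_dim, grid_size, grid_size, grid_size))
        # Tiny MLP heads: local feature + time index -> density and RGB
        self.density_mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.color_mlp = nn.Sequential(
            nn.Linear(feat_dim + 1, hidden), nn.ReLU(), nn.Linear(hidden, 3))

    def forward(self, xyz, t):
        # xyz: (N, 3) points in [-1, 1]^3, t: (N, 1) normalized time index
        grid = xyz.view(1, -1, 1, 1, 3)                  # grid_sample layout
        feat = F.grid_sample(self.voxel, grid, align_corners=True)
        feat = feat.view(self.voxel.shape[1], -1).t()    # (N, C) local features
        feat_t = torch.cat([feat, t], dim=-1)
        sigma = F.softplus(self.density_mlp(feat_t))     # density field
        rgb = torch.sigmoid(self.color_mlp(feat_t))      # texture/color field
        return sigma, rgb

# Usage: query 1024 random space-time samples.
model = VoxelField4D()
sigma, rgb = model(torch.rand(1024, 3) * 2 - 1, torch.rand(1024, 1))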

In this paper, a computation-efficient regression framework is presented for estimating the 6D pose of rigid objects from a single RGB-D image, which is applicable to handling symmetric objects. The framework is designed with a simple architecture that efficiently extracts point-wise features from the RGB-D data using a fully convolutional network, called XYZNet, and directly regresses the 6D pose without any post-refinement. In the case of a symmetric object, one object has multiple ground-truth poses, and this one-to-many relationship may lead to estimation ambiguity....

10.1109/cvpr52688.2022.00660 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01
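To make the symmetry ambiguity mentioned in the abstract concrete, the sketch below scores a predicted pose against every valid ground-truth pose of a symmetric object and keeps the minimum point distance. This is a generic min-over-symmetries formulation for illustration only, not necessarily the loss proposed in the paper.

# Generic illustration of one-to-many pose ambiguity for symmetric objects.
import torch

def transform(points, R, t):
    # points: (N, 3), R: (3, 3), t: (3,)
    return points @ R.T + t

def min_sym_distance(points, R_pred, t_pred, gt_poses):
    # gt_poses: list of (R, t) pairs that are all valid for a symmetric object
    pred = transform(points, R_pred, t_pred)
    dists = []
    for R_gt, t_gt in gt_poses:
        gt = transform(points, R_gt, t_gt)
        dists.append((pred - gt).norm(dim=-1).mean())   # ADD-style mean distance
    return torch.stack(dists).min()                     # best-matching symmetry

# Usage: an object symmetric under a 180-degree rotation about the z-axis.
pts = torch.rand(500, 3)
Rz = torch.tensor([[-1., 0., 0.], [0., -1., 0.], [0., 0., 1.]])
gt = [(torch.eye(3), torch.zeros(3)), (Rz, torch.zeros(3))]
loss = min_sym_distance(pts, torch.eye(3), torch.full((3,), 0.01), gt)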

3D occupancy estimation from surrounding-view images is an exciting task in autonomous driving, following the success of Bird's Eye View (BEV) perception. In this work, we present a comprehensive framework for occupancy estimation, which reveals several key components such as network design, optimization, and evaluation. In addition, we explore the relationship with other related tasks, such as monocular depth estimation and 3D reconstruction, which could advance the study of 3D perception in autonomous driving. For evaluation, we propose a simple sampling strategy to...

10.1109/tiv.2024.3403134 article EN IEEE Transactions on Intelligent Vehicles 2024-01-01
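Since the abstract is truncated before the sampling strategy is spelled out, the snippet below only illustrates the general idea of point-level occupancy evaluation: sample query points in the scene volume, look up predicted and reference occupancy, and compute an IoU. The grid resolution, ranges, and threshold are illustrative assumptions, not the protocol proposed in the paper.

# Minimal sketch of sampled point-level occupancy IoU (illustrative only).
import numpy as np

def occupancy_iou(pred_grid, gt_grid, n_samples=100_000, thresh=0.5, rng=None):
    # pred_grid, gt_grid: (D, H, W) arrays; pred in [0, 1], gt in {0, 1}
    rng = rng or np.random.default_rng(0)
    D, H, W = gt_grid.shape
    # Uniformly sample voxel indices as query points
    idx = rng.integers(0, [D, H, W], size=(n_samples, 3))
    pred = pred_grid[idx[:, 0], idx[:, 1], idx[:, 2]] > thresh
    gt = gt_grid[idx[:, 0], idx[:, 1], idx[:, 2]] > 0
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / max(union, 1)

# Usage with random volumes standing in for network output and ground truth.
pred = np.random.rand(200, 200, 16)
gt = (np.random.rand(200, 200, 16) > 0.9).astype(np.float32)
print(occupancy_iou(pred, gt))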

We introduce GaussianOcc, a systematic method that investigates two usages of Gaussian splatting for fully self-supervised and efficient 3D occupancy estimation in surround views. First, traditional methods still require ground-truth 6D poses from sensors during training. To address this limitation, we propose a Gaussian Splatting for Projection (GSP) module to provide accurate scale information for training from adjacent view projection. Additionally, existing methods rely on volume rendering for the final voxel...

10.48550/arxiv.2408.11447 preprint EN arXiv (Cornell University) 2024-08-21
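The abstract refers to recovering metric scale from projection between adjacent surround-view cameras. The snippet below shows generic pinhole cross-view reprojection, the kind of geometric cue involved; it is a plain reprojection sketch for illustration, not the paper's Gaussian Splatting for Projection (GSP) module itself.

# Cross-view reprojection between adjacent cameras (generic illustration).
import numpy as np

def reproject(depth_a, K_a, K_b, T_b_from_a):
    # depth_a: (H, W) metric depth in camera A; K_*: (3, 3) intrinsics;
    # T_b_from_a: (4, 4) rigid transform taking camera-A coordinates to camera B.
    H, W = depth_a.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    # Back-project pixels of camera A to 3D points in camera-A frame
    pts_a = (np.linalg.inv(K_a) @ pix.T) * depth_a.reshape(1, -1)
    pts_a_h = np.vstack([pts_a, np.ones((1, pts_a.shape[1]))])
    # Move the points into camera-B frame and project with B's intrinsics
    pts_b = (T_b_from_a @ pts_a_h)[:3]
    proj = K_b @ pts_b
    uv_b = proj[:2] / np.clip(proj[2:], 1e-6, None)
    return uv_b.T.reshape(H, W, 2)   # where each pixel of A lands in image B

# Usage with toy intrinsics and a small sideways offset between cameras.
K = np.array([[500., 0., 320.], [0., 500., 240.], [0., 0., 1.]])
T = np.eye(4); T[0, 3] = 0.5
coords = reproject(np.full((480, 640), 10.0), K, K, T)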

The task of estimating 3D occupancy from surrounding-view images is an exciting development in the field of autonomous driving, following the success of Bird's Eye View (BEV) perception. This task provides crucial attributes of the driving environment, enhancing the overall understanding and perception of the surrounding space. In this work, we present a simple framework for occupancy estimation, which is CNN-based and designed to reveal several key factors such as network design, optimization, and evaluation. In addition, we explore the relationship...

10.48550/arxiv.2303.10076 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Neural radiance fields have made a remarkable breakthrough in the novel view synthesis task for 3D static scenes. However, in the 4D setting (e.g., dynamic scenes), the performance of existing methods is still limited by the capacity of the neural network, typically a multilayer perceptron (MLP). In this paper, we utilize 3D voxels to model the 4D neural radiance field, abbreviated as V4D, where the voxels have two formats. The first one regularly models the 3D space and then uses the sampled local feature with the time index to model the density field and the texture field with a tiny MLP....

10.48550/arxiv.2205.14332 preprint EN other-oa arXiv (Cornell University) 2022-01-01

In this paper, a computation-efficient regression framework is presented for estimating the 6D pose of rigid objects from a single RGB-D image, which is applicable to handling symmetric objects. The framework is designed with a simple architecture that efficiently extracts point-wise features from the RGB-D data using a fully convolutional network, called XYZNet, and directly regresses the 6D pose without any post-refinement. In the case of a symmetric object, one object has multiple ground-truth poses, and this one-to-many relationship may lead to estimation ambiguity....

10.48550/arxiv.2204.01080 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Accurate height estimation from monocular aerial imagery presents a significant challenge due to its inherently ill-posed nature. This limitation is rooted in the absence of adequate geometric constraints available to the model when training with monocular imagery. Without additional information to supplement the image data, the model's ability to provide reliable estimations is compromised. In this paper, we propose a method that enhances height estimation by incorporating street-view images. Our insight is that street-view images provide a distinct viewing perspective and...

10.48550/arxiv.2311.02121 preprint EN other-oa arXiv (Cornell University) 2023-01-01
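To illustrate the idea of pairing aerial and street-view imagery described in this abstract, the sketch below fuses features from a nadir branch and a street-view branch before regressing a per-pixel height map. The two-branch architecture and the simple global-pooling fusion are assumptions made for illustration, not the paper's actual model or alignment scheme.

# Illustrative two-branch aerial + street-view height regressor (assumed design).
import torch
import torch.nn as nn

class AerialStreetHeightNet(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        def encoder():
            return nn.Sequential(
                nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(),
                nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU())
        self.aerial_enc = encoder()       # nadir (top-down) branch
        self.street_enc = encoder()       # street-view branch
        self.head = nn.Sequential(        # fused features -> height map
            nn.Conv2d(2 * ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 1))

    def forward(self, aerial, street):
        fa = self.aerial_enc(aerial)
        # Pool the street-view features into a global descriptor and broadcast it
        # over the aerial grid as a simple stand-in for geometric alignment.
        fs = self.street_enc(street).mean(dim=(2, 3), keepdim=True)
        fs = fs.expand(-1, -1, fa.shape[2], fa.shape[3])
        return self.head(torch.cat([fa, fs], dim=1)).relu()  # non-negative heights

# Usage with toy tensors: one aerial tile and one street-view image.
net = AerialStreetHeightNet()
height = net(torch.rand(1, 3, 128, 128), torch.rand(1, 3, 96, 192))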