Yan Di

ORCID: 0000-0003-0671-8323
Research Areas
  • Robotics and Sensor-Based Localization
  • Robot Manipulation and Learning
  • Human Pose and Action Recognition
  • 3D Shape Modeling and Analysis
  • Soft Robotics and Applications
  • Image Processing and 3D Reconstruction
  • Advanced Vision and Imaging
  • Advanced Neural Network Applications
  • Image and Object Detection Techniques
  • 3D Surveying and Cultural Heritage
  • Advanced Radiotherapy Techniques
  • Lung Cancer Diagnosis and Treatment
  • Multimodal Machine Learning Applications
  • Advanced Optical Sensing Technologies
  • Medical Imaging Techniques and Applications
  • Reinforcement Learning in Robotics
  • Robotic Path Planning Algorithms
  • Advanced Image and Video Retrieval Techniques
  • Anatomy and Medical Technology
  • Face Recognition and Analysis
  • Medical Imaging and Analysis
  • Image Retrieval and Classification Techniques

Technical University of Munich
2021-2024

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (i.e. the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem. While end-to-end methods have recently demonstrated promising results at high efficiency, they are still inferior when compared with elaborate PnP/RANSAC-based approaches in terms of accuracy. In this work, we address this shortcoming by means of novel reasoning about self-occlusion, in order to establish a two-layer...

10.1109/iccv48922.2021.01217 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

While 6D object pose estimation has recently made a huge leap forward, most methods can still only handle a single or a handful of different objects, which limits their applications. To circumvent this problem, category-level object pose estimation has recently been revamped, which aims at predicting the 6D pose as well as the 3D metric size for previously unseen instances from a given set of object classes. This is, however, a much more challenging task due to severe intra-class shape variations. To address this issue, we propose GPV-Pose, a novel framework for robust...

10.1109/cvpr52688.2022.00666 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

6-DoF robotic grasping is a long-lasting but unsolved problem. Recent methods utilize strong 3D networks to extract geometric grasping representations from depth sensors, demonstrating superior accuracy on common objects but performing unsatisfactorily on photometrically challenging objects, e.g., objects in transparent or reflective materials. The bottleneck lies in that the surface of these objects cannot reflect accurate depth due to the absorption or refraction of light. In this paper, in contrast to exploiting the inaccurate depth data, we propose the first...

10.1109/icra48891.2023.10160779 article EN 2023-05-29

Monocular 3D object detection has recently made a significant leap forward thanks to the use of pre-trained depth estimators for pseudo-LiDAR recovery. Yet, such two-stage methods typically suffer from overfitting and are incapable of explicitly encapsulating the geometric relation between depth and the object bounding box. To overcome this limitation, we instead propose to jointly estimate dense scene depth with depth-bounding box residuals and object bounding boxes, allowing a two-stream detection of 3D objects that harnesses both geometry and context information....

10.1109/lra.2023.3238137 article EN cc-by IEEE Robotics and Automation Letters 2023-01-19

Category-level pose estimation is a challenging problem due to intra-class shape variations. Recent methods deform pre-computed shape priors to map the observed point cloud into the normalized object coordinate space and then retrieve the pose via post-processing, i.e., Umeyama's Algorithm. The shortcomings of this two-stage strategy lie in two aspects: 1) The surrogate supervision on the intermediate results cannot directly guide the learning of pose, resulting in large pose error after post-processing. 2) The inference speed is limited by...

10.1109/iros47612.2022.9981506 article EN 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022-10-23

Category-level 6D object pose estimation aims at determining the pose of an object of a given category. Most current state-of-the-art methods require a significant amount of real training data to supervise their models. Moreover, annotating the 6D pose is very time consuming and error-prone, and it does not scale well to a large number of classes. Therefore, a handful of methods have recently been proposed to use unlabelled data to establish weak supervision. In this letter we propose a self-supervised method that leverages 2D optical flow as a proxy for supervising...

10.1109/lra.2023.3254463 article EN IEEE Robotics and Automation Letters 2023-03-08

In this paper, we propose U-RED, an Unsupervised shape REtrieval and Deformation pipeline that takes an arbitrary object observation as input, typically captured by RGB images or scans, and jointly retrieves and deforms the geometrically similar CAD models from a pre-established database to tightly match the target. Considering that existing methods fail to handle noisy partial observations, U-RED is designed to address this issue from two aspects. First, since one partial shape may correspond to multiple potential full shapes, the retrieval...

10.1109/iccv51070.2023.00816 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

While RGBD-based methods for category-level object pose estimation hold promise, their reliance on depth data limits their applicability in diverse scenarios. In response, recent efforts have turned to RGB-based methods; however, they face significant challenges stemming from the absence of depth information. On the one hand, the lack of depth exacerbates the difficulty of handling intra-class shape variation, resulting in increased uncertainty in shape predictions. On the other hand, RGB-only inputs introduce inherent scale ambiguity, rendering the estimation of object size...

10.48550/arxiv.2409.15727 preprint EN arXiv (Cornell University) 2024-09-24

Directly regressing all 6 degrees-of-freedom (6DoF) for the object pose (i.e. the 3D rotation and translation) in a cluttered environment from a single RGB image is a challenging problem. While end-to-end methods have recently demonstrated promising results at high efficiency, they are still inferior when compared with elaborate PnP/RANSAC-based approaches in terms of accuracy. In this work, we address this shortcoming by means of novel reasoning about self-occlusion, in order to establish a two-layer...

10.48550/arxiv.2108.08367 preprint EN cc-by-sa arXiv (Cornell University) 2021-01-01