Qi Ye

ORCID: 0000-0003-2285-3402
Research Areas
  • Human Pose and Action Recognition
  • Advanced Vision and Imaging
  • Hand Gesture Recognition Systems
  • Robotics and Sensor-Based Localization
  • Robot Manipulation and Learning
  • Video Surveillance and Tracking Methods
  • 3D Surveying and Cultural Heritage
  • Human Motion and Animation
  • Image Enhancement Techniques
  • 3D Shape Modeling and Analysis
  • Industrial Technology and Control Systems
  • Computer Graphics and Visualization Techniques
  • Simulation and Modeling Applications
  • Manufacturing Process and Optimization
  • Image Processing and 3D Reconstruction
  • Muscle activation and electromyography studies
  • Tactile and Sensory Interactions
  • Soft Robotics and Applications
  • Cyclone Separators and Fluid Dynamics
  • Industrial Gas Emission Control
  • COVID-19 diagnosis using AI
  • Global Energy Security and Policy
  • Domain Adaptation and Few-Shot Learning
  • Adversarial Robustness in Machine Learning
  • Wireless Body Area Networks

South China University of Technology
2025

Zhejiang University
2014-2025

State Key Laboratory of Industrial Control Technology
2023-2025

Zhejiang University of Technology
2023-2025

Dahua Technology (China)
2024

Southern University of Science and Technology
2023

Beijing University of Posts and Telecommunications
2021

University of Shanghai for Science and Technology
2020-2021

Chongqing University
2020

Microsoft Research (United Kingdom)
2020

In this paper we introduce a large-scale hand pose dataset, collected using a novel capture method. Existing datasets are either generated synthetically or captured using depth sensors: synthetic datasets exhibit a certain level of appearance difference from real images, and real datasets are limited in quantity and coverage, mainly due to the difficulty of annotating them. We propose a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint annotations of depth maps with minimal restriction on the range of motion. The...

10.1109/cvpr.2017.279 preprint EN 2017-07-01

Implicit neural representations have shown compelling results in offline 3D reconstruction and have also recently demonstrated potential for online SLAM systems. However, applying them to autonomous reconstruction, where a robot is required to explore a scene and plan a view path, has not been studied. In this paper, we explore for the first time the possibility of using implicit neural representations for autonomous reconstruction by addressing two key challenges: 1) seeking a criterion to measure the quality of candidate viewpoints for view planning based on the new representations, and 2) learning...

10.1109/lra.2023.3235686 article EN IEEE Robotics and Automation Letters 2023-01-09

In this work, we present I²-SDF, a new method for intrinsic indoor scene reconstruction and editing using differentiable Monte Carlo raytracing on neural signed distance fields (SDFs). Our holistic neural SDF-based framework jointly recovers the underlying shapes, incident radiance, and materials from multi-view images. We introduce a novel bubble loss for fine-grained small objects and error-guided adaptive...

10.1109/cvpr52729.2023.01202 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

We present the 2017 Hands in the Million Challenge, a public competition designed for the evaluation of the task of 3D hand pose estimation. The goal of this challenge is to assess how far the state of the art is in terms of solving the problem of 3D hand pose estimation, as well as to detect major failure and strength modes of both systems and evaluation metrics that can help identify future research directions. The challenge follows up the recent publication of the BigHand2.2M and First-Person Hand Action datasets, which have been designed to exhaustively cover multiple hand, viewpoint, hand articulation, and occlusion....

10.48550/arxiv.1707.02237 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Bimanual dexterous manipulation remains a significant challenge in robotics due to the high DoFs of each hand and their coordination. Existing single-hand manipulation techniques often leverage human demonstrations to guide RL methods but fail to generalize to complex bimanual tasks involving multiple sub-skills. In this paper, we introduce VTAO-BiManip, a novel framework that combines visual-tactile-action pretraining with object understanding to facilitate curriculum RL and enable human-like bimanual manipulation. We improve prior...

10.48550/arxiv.2501.03606 preprint EN arXiv (Cornell University) 2025-01-07

In this paper we introduce a large-scale hand pose dataset, collected using a novel capture method. Existing datasets are either generated synthetically or captured using depth sensors: synthetic datasets exhibit a certain level of appearance difference from real images, and real datasets are limited in quantity and coverage, mainly due to the difficulty of annotating them. We propose a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint annotations of depth maps with minimal restriction on the range of motion. The...

10.48550/arxiv.1704.02612 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Implicit neural representations have shown promising potential for 3D scene reconstruction. Recent work applies them to autonomous reconstruction by learning information gain for view path planning. Effective as it is, the computation of the information gain is expensive, and compared with that using volumetric representations, collision checking using the implicit representation for a point is much slower. In this paper, we propose to 1) leverage a neural network as an implicit function approximator for the information gain field and 2) combine the fine-grained implicit representation with coarse volumetric representations to improve efficiency. Further...

10.1109/icra48891.2023.10160793 article EN 2023-05-29

3D grasp synthesis generates grasping poses given an input object. Existing works tackle the problem by learning a direct mapping from objects to the distributions of grasping poses. However, because physical contact is sensitive to small changes in pose, the highly nonlinear mapping between the object representation and valid poses is considerably non-smooth, leading to poor generation efficiency and restricted generality. To tackle the challenge, we introduce an intermediate variable for grasp contact areas to constrain the grasp generation; in other words, we factorize the mapping into two...

10.24963/ijcai.2023/117 article EN 2023-08-01

Multi-object tracking (MOT) aims to build moving trajectories for number-agnostic objects. Modern multi-object trackers commonly follow the tracking-by-detection strategy. Therefore, fooling detectors can be an effective solution, but it usually requires attacks in multiple successive frames, resulting in low efficiency. Attacking association processes improves efficiency but may require model-specific design, leading to poor generalization. In this paper, we propose a novel False negative and positive...

10.1109/iccv51070.2023.00422 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Robotic dexterous grasping is a challenging problem due to the high degree of freedom (DoF) and complex contacts of multi-fingered robotic hands. Existing deep reinforcement learning (DRL) based methods leverage human demonstrations to reduce the sample complexity caused by the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations for high-level generalization. In this paper, we propose a novel geometric and spatial representation, named DexRep, to capture...

10.1109/iros55552.2023.10342334 article EN 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2023-10-01

Many low-level computer vision tasks benefit from using the unprocessed RAW image as input, which retains the linear relationship between pixel values and scene radiance. Recent works advocate embedding RAW samples into sRGB images at capture time and reconstructing the RAW image from sRGB using these metadata when needed. However, there still exist some limitations in making full use of the metadata. In this paper, instead of following the perspective of sRGB-to-RAW mapping, we reformulate the problem as mapping the 2D coordinates to its RAW values conditioned...

10.1109/cvpr52729.2023.01745 article EN 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2023-06-01

The Metaverse refers to the integration of physical and virtual realities, offering new possibilities for enhancing operations and services across various industries. However, its application in the energy sector is still at a nascent stage. The energy industry, crucial to the global economy and society, faces significant challenges due to its complex and risky nature, such as health, safety, and environmental (HSE) concerns, and the remote locations of extraction sites. Although some studies have explored the use of the Metaverse in this industry for data visualization, process...

10.1109/tcyb.2024.3475272 article EN IEEE Transactions on Cybernetics 2024-11-04

Multi-object tracking (MOT) in the scenario of low-frame-rate videos is a promising solution for deploying MOT methods on edge devices with limited computing, storage, power, and transmission bandwidth. Tracking at a low frame rate poses particular challenges in the association stage, as objects in two successive frames typically exhibit much quicker variations in locations, velocities, appearances, and visibilities than those at normal frame rates. In this paper, we observe severe performance degeneration of many existing...

10.1145/3503161.3548162 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Hand pose estimation, formulated as an inverse problem, is typically optimized by an energy function over pose parameters using a 'black box' image generation procedure, knowing little about either the relationships between the parameters or the form of the energy function. In this paper, we show significant improvement upon such black box optimization by exploiting high-level knowledge of the parameter structure and using a local surrogate energy function. Our new framework, hierarchical sampling optimization (HSO), consists of a sequence of discriminative predictors organized into...

10.1109/tpami.2018.2847688 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2018-06-15

Learning and predicting the pose parameters of a 3D hand model given an image, such as the locations of hand joints, is challenging due to large viewpoint changes and articulations, and severe self-occlusions exhibited particularly in egocentric views. Both feature learning and prediction modeling have been investigated to tackle the problem. Though effective, most existing discriminative methods yield a single deterministic estimation of target poses. Due to their intrinsic single-value mapping, they fail to adequately handle...

10.48550/arxiv.1711.10872 preprint EN other-oa arXiv (Cornell University) 2017-01-01