- Human Pose and Action Recognition
- Advanced Vision and Imaging
- Hand Gesture Recognition Systems
- Robotics and Sensor-Based Localization
- Robot Manipulation and Learning
- Video Surveillance and Tracking Methods
- 3D Surveying and Cultural Heritage
- Human Motion and Animation
- Image Enhancement Techniques
- 3D Shape Modeling and Analysis
- Industrial Technology and Control Systems
- Computer Graphics and Visualization Techniques
- Simulation and Modeling Applications
- Manufacturing Process and Optimization
- Image Processing and 3D Reconstruction
- Muscle activation and electromyography studies
- Tactile and Sensory Interactions
- Soft Robotics and Applications
- Cyclone Separators and Fluid Dynamics
- Industrial Gas Emission Control
- COVID-19 diagnosis using AI
- Global Energy Security and Policy
- Domain Adaptation and Few-Shot Learning
- Adversarial Robustness in Machine Learning
- Wireless Body Area Networks
South China University of Technology
2025
Zhejiang University
2014-2025
State Key Laboratory of Industrial Control Technology
2023-2025
Zhejiang University of Technology
2023-2025
Dahua Technology (China)
2024
Southern University of Science and Technology
2023
Beijing University of Posts and Telecommunications
2021
University of Shanghai for Science and Technology
2020-2021
Chongqing University
2020
Microsoft Research (United Kingdom)
2020
In this paper we introduce a large-scale hand pose dataset, collected using a novel capture method. Existing datasets are either generated synthetically or captured using depth sensors: synthetic datasets exhibit a certain level of appearance difference from real depth images, and real datasets are limited in quantity and coverage, mainly due to the difficulty of annotating them. We propose a tracking system with six 6D magnetic sensors and inverse kinematics to automatically obtain 21-joint hand pose annotations of depth maps captured with minimal restriction on the range of motion. The...
Implicit neural representations have shown compelling results in offline 3D reconstruction and have also recently demonstrated potential for online SLAM systems. However, applying them to autonomous 3D reconstruction, where a robot is required to explore a scene and plan a view path for the reconstruction, has not been studied. In this paper, we explore for the first time the possibility of using implicit neural representations for autonomous 3D reconstruction by addressing two key challenges: 1) seeking a criterion to measure the quality of candidate viewpoints for view planning based on the new representations, and 2) learning...
In this work, we present I²-SDF, a new method for intrinsic indoor scene reconstruction and editing using differentiable Monte Carlo raytracing on neural signed distance fields (SDFs). Our holistic neural SDF-based framework jointly recovers the underlying shapes, incident radiance, and materials from multi-view images. We introduce a novel bubble loss for fine-grained small objects and error-guided adaptive...
We present the 2017 Hands in the Million Challenge, a public competition designed for the evaluation of the task of 3D hand pose estimation. The goal of this challenge is to assess how far the state of the art is in terms of solving the problem of 3D hand pose estimation, as well as to detect major failure and strength modes of both systems and evaluation metrics that can help to identify future research directions. The challenge follows up the recent publication of the BigHand2.2M and First-Person Hand Action datasets, which have been designed to exhaustively cover multiple hand, viewpoint, hand articulation, and occlusion....
Bimanual dexterous manipulation remains a significant challenge in robotics due to the high DoFs of each hand and their coordination. Existing single-hand manipulation techniques often leverage human demonstrations to guide RL methods but fail to generalize to complex bimanual tasks involving multiple sub-skills. In this paper, we introduce VTAO-BiManip, a novel framework that combines visual-tactile-action pretraining with object understanding to facilitate curriculum RL and enable human-like bimanual manipulation. We improve prior...
Implicit neural representations have shown promising potential for 3D scene reconstruction. Recent work applies them to autonomous 3D reconstruction by learning information gain for view path planning. Effective as it is, the computation of the information gain is expensive, and compared with that using volumetric representations, collision checking using the implicit representation for a 3D point is much slower. In the paper, we propose to 1) leverage a neural network as an implicit function approximator for the information gain field and 2) combine the implicit fine-grained representation with coarse volumetric representations to improve efficiency. Further...
3D grasp synthesis generates grasping poses given an input object. Existing works tackle the problem by learning a direct mapping from objects to the distributions of grasping poses. However, because physical contact is sensitive to small changes in pose, the highly nonlinear mapping from the 3D object representation to valid poses is considerably non-smooth, leading to poor generation efficiency and restricted generality. To tackle the challenge, we introduce an intermediate variable for grasp contact areas to constrain the grasp generation; in other words, we factorize the mapping into two...
Multi-object tracking (MOT) aims to build moving trajectories for number-agnostic objects. Modern multi-object trackers commonly follow the tracking-by-detection strategy. Therefore, fooling detectors can be an effective solution, but it usually requires attacks in multiple successive frames, resulting in low efficiency. Attacking association processes improves efficiency but may require model-specific design, leading to poor generalization. In this paper, we propose a novel False negative and False positive...
Robotic dexterous grasping is a challenging problem due to the high degree of freedom (DoF) and complex contacts of multi-fingered robotic hands. Existing deep reinforcement learning (DRL) based methods leverage human demonstrations to reduce the sample complexity caused by the high-dimensional action space of dexterous grasping. However, less attention has been paid to hand-object interaction representations for high-level generalization. In this paper, we propose a novel geometric and spatial hand-object interaction representation, named DexRep, to capture...
Many low-level computer vision tasks benefit from using the unprocessed RAW image as input, since it preserves the linear relationship between pixel values and scene radiance. Recent works advocate embedding RAW image samples into sRGB images at capture time and reconstructing the RAW image from sRGB using these metadata when needed. However, there still exist some limitations in making full use of the metadata. In this paper, instead of following the perspective of sRGB-to-RAW mapping, we reformulate the problem as mapping the 2D coordinates of the metadata to its RAW values conditioned...
The Metaverse refers to the integration of physical and virtual realities, offering new possibilities for enhancing operations and services across various industries. However, its application in the energy sector is still at a nascent stage. The energy industry, crucial to the global economy and society, faces significant challenges due to its complex and risky nature, such as health, safety, and environmental (HSE) concerns, and the remote locations of extraction sites. Although some studies have explored the use of the Metaverse in this industry for data visualization, process...
Multi-object tracking (MOT) in the scenario of low-frame-rate videos is a promising solution for deploying MOT methods on edge devices with limited computing, storage, power, and transmission bandwidth. Tracking at a low frame rate poses particular challenges in the association stage, as objects in two successive frames typically exhibit much quicker variations in locations, velocities, appearances, and visibilities than those at normal frame rates. In this paper, we observe severe performance degeneration of many existing...
Hand pose estimation, formulated as an inverse problem, is typically optimized by an energy function over pose parameters using a 'black box' image generation procedure, knowing little about either the relationships between the parameters or the form of the energy function. In this paper, we show significant improvement upon such black box optimization by exploiting high-level knowledge of the parameter structure and using a local surrogate energy function. Our new framework, hierarchical sampling optimization (HSO), consists of a sequence of discriminative predictors organized into...
Learning and predicting the pose parameters of a 3D hand model given an image, such as the locations of hand joints, is challenging due to large viewpoint changes and articulations, and the severe self-occlusions exhibited particularly in egocentric views. Both feature learning and prediction modeling have been investigated to tackle the problem. Though effective, most existing discriminative methods yield a single deterministic estimation of target poses. Due to their intrinsic single-value mapping, they fail to adequately handle...