- Robot Manipulation and Learning
- Human Pose and Action Recognition
- Multimodal Machine Learning Applications
- Advanced Algorithms and Applications
- Image Processing and 3D Reconstruction
- Advanced Vision and Imaging
- Tactile and Sensory Interactions
- Advanced Neural Network Applications
- Domain Adaptation and Few-Shot Learning
- Advanced Image Processing Techniques
- 3D Shape Modeling and Analysis
- Cell Image Analysis Techniques
- Advanced Computational Techniques and Applications
- Metaheuristic Optimization Algorithms Research
- Advanced Sensor and Control Systems
- Robotic Mechanisms and Dynamics
- Artificial Immune Systems Applications
- Generative Adversarial Networks and Image Synthesis
- Manufacturing Process and Optimization
- Reinforcement Learning in Robotics
- Adversarial Robustness in Machine Learning
- Machine Learning and Data Classification
- Video Analysis and Summarization
- Industrial Vision Systems and Defect Detection
- Advanced Machining Processes and Optimization
King University
2024
Peking University
2020-2024
Sanden (Japan)
2011
University of Jinan
2009-2010
Shape assembly aims to reassemble parts (or fragments) into a complete object, which is a common task in our daily life. Different from semantic part assembly (e.g., assembling a chair's semantic parts like legs into a whole chair), geometric part assembly (e.g., assembling bowl fragments into a complete bowl) is an emerging task in computer vision and robotics. Instead of semantic information, this task focuses on the geometric information of parts. As both the geometric and pose spaces of fractured parts are exceptionally large, shape-pose disentanglement of part representations is beneficial to geometric shape assembly. In this paper, we propose to leverage SE(3) equivariance...
Despite the great success of deep convolutional neural networks in object classification, they suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data. Because of the large variance of occluders, our goal is a model trained on occlusion-free data while generalizable to occlusion conditions. In this work, we integrate prototypes, partial matching and top-down attention regulation into deep neural networks to realize robust object classification under occlusion. We first introduce...
Understanding and manipulating deformable objects (e.g., ropes and fabrics) is an essential yet challenging task with broad applications. Difficulties come from the complex states and dynamics, diverse configurations and high-dimensional action space of deformable objects. Besides, the manipulation tasks usually require multiple steps to accomplish, and greedy policies may easily lead to local optimal states. Existing studies tackle this problem using reinforcement learning or imitating expert demonstrations, with limitations in...
Visual relation detection (VRD) aims to describe all interacting objects in an image using subject-predicate-object triplets. Critically, the number of valid relations grows combinatorially as O(C^2 R) for C object categories and R relationships. The frequencies of relation triplets exhibit a long-tailed distribution, which inevitably leads to bias towards popular visual relations in the learned VRD model. To address this problem, we propose the localize-assemble-predicate network (LAP-Net), which decomposes VRD into three sub-tasks: localizing...
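The O(C^2 R) growth stated above can be made concrete with a short count. This is only a worked illustration: the values C = 100 and R = 50 are assumed for the arithmetic and are not taken from the abstract.

```latex
% Each triplet picks a subject category, a predicate, and an object category.
\[
  N_{\text{triplets}} = \underbrace{C}_{\text{subject}} \cdot \underbrace{R}_{\text{predicate}} \cdot \underbrace{C}_{\text{object}} = C^{2} R
\]
% Illustrative values (assumed): C = 100, R = 50
\[
  N_{\text{triplets}} = 100^{2} \cdot 50 = 500{,}000
\]
```

Even with moderately sized vocabularies, the candidate space is far larger than the number of triplets that typically appear in training data, which is consistent with the long-tailed distribution the abstract describes.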
To solve the problem of the quantum-behaved particle swarm optimization algorithm (QPSO) easily falling into local optima, we proposed a multiple immune clonal QPSO, in which the swarm was divided into two subgroups dynamically according to each particle's fitness. In the subgroup with better fitness, the clonal operation was carried out with Gaussian mutation to do local searching, and in the other subgroup with Cauchy mutation to do global searching. We also made a comparison with standard wavelet de-noising. From the experimental results, we can see that this improved method had the ability of searching for the optimum...
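The fitness-based split with two mutation types described above might be sketched as follows. This is a minimal illustration, not the authors' algorithm: the half-and-half split, the mutation scale, the greedy acceptance rule, and the `sphere` objective are all assumptions, and the QPSO position update and immune clonal selection are omitted.

```python
import math
import random


def sphere(x):
    """Toy objective (assumption): minimize the sum of squares."""
    return sum(v * v for v in x)


def split_by_fitness(swarm, objective):
    """Rank particles by objective value and split into better/worse halves."""
    ranked = sorted(swarm, key=objective)
    mid = len(ranked) // 2
    return ranked[:mid], ranked[mid:]


def gaussian_mutation(x, rng, scale=0.1):
    """Small, light-tailed perturbation, suited to local search."""
    return [v + scale * rng.gauss(0.0, 1.0) for v in x]


def cauchy_mutation(x, rng, scale=0.1):
    """Heavy-tailed perturbation, suited to global search.

    A standard Cauchy sample is drawn by the inverse CDF: tan(pi * (u - 0.5)).
    """
    return [v + scale * math.tan(math.pi * (rng.random() - 0.5)) for v in x]


def mutate_swarm(swarm, objective, rng):
    """Apply Gaussian mutation to the better subgroup and Cauchy to the rest.

    A mutated particle replaces the original only if it improves the
    objective (greedy acceptance, an assumption of this sketch).
    """
    better, worse = split_by_fitness(swarm, objective)
    result = []
    for group, mutate in ((better, gaussian_mutation), (worse, cauchy_mutation)):
        for x in group:
            candidate = mutate(x, rng)
            result.append(candidate if objective(candidate) < objective(x) else x)
    return result
```

Because the split is recomputed on every call, particles can move between subgroups as their fitness changes, which is one way to read "divided dynamically".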
To deal with the problem of the quantum-behaved particle swarm optimization algorithm (QPSO) easily falling into local optima, we proposed a diversity-guided immune clonal QPSO. In this algorithm, the swarm was defined to have two states: attraction and expansion. During the search process, the swarm transferred between the two states repeatedly with reference to its diversity. While in a state, if the diversity is less than a pre-established value, the swarm will carry out searching. We also used the wavelet method to forecast foundation settlement and made a comparison with the standard wavelet method. The experiment indicated...
With the development of Embodied AI, Robotics and Augmented Reality, videos captured from the 'first-person' point of view, also known as egocentric videos, are arousing interest in the Computer Vision communities. Further, learning a proper representation of egocentric videos can benefit diverse downstream tasks like action forecasting and human-object interactions, which is further beneficial for robotic planning. However, current works mostly focus on temporal or topological information in video representations, while activity...
Manipulating garments and fabrics has long been a critical endeavor in the development of home-assistant robots. However, due to complex dynamics and topological structures, garment manipulations pose significant challenges. Recent successes in reinforcement learning and vision-based methods offer promising avenues for garment manipulation. Nevertheless, these approaches are severely constrained by current benchmarks, which offer limited diversity of tasks and unrealistic simulation behavior. Therefore, we present...
Visual actionable affordance has emerged as a transformative approach in robotics, focusing on perceiving interaction areas prior to manipulation. Traditional methods rely on pixel sampling to identify successful interaction samples or on processing pointclouds for affordance mapping. However, these approaches are computationally intensive and struggle to adapt to diverse and dynamic environments. This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects using a large pre-trained vision transformer (ViT)....
Humans perceive and interact with the world with an awareness of equivariance, which helps us manipulate different objects in diverse poses. For robotic manipulation, such equivariance also exists in many scenarios. For example, no matter what pose a drawer is in (translation, rotation and tilt), the manipulation strategy is consistent (grasp the handle and pull in a line). While traditional models usually do not have the awareness of equivariance for robotic manipulation, which might result in more data for training and poor performance in novel object poses, we propose our EqvAfford...
Unpaired image-to-image translation is a class of vision problems whose goal is to find the mapping between different image domains using unpaired training data. Cycle-consistency loss is a widely used constraint for such problems. However, due to the strict pixel-level constraint, it cannot perform geometric changes, remove large objects, or ignore irrelevant texture. In this paper, we propose a novel adversarial-consistency loss for image-to-image translation. This loss does not require the translated image to be translated back to a specific source image but can...
It is essential yet challenging for future home-assistant robots to understand and manipulate diverse 3D objects in daily human environments. Towards building scalable systems that can perform diverse manipulation tasks over various 3D shapes, recent works have advocated and demonstrated promising results in learning visual actionable affordance, which labels every point over the input 3D geometry with an action likelihood of accomplishing the downstream task (e.g., pushing or picking-up). However, these works only studied...
Learning an accurate model of the environment is essential for model-based control tasks. Existing methods in robotic visuomotor control usually learn from data with heavily labelled actions, object entities or locations, which can be demanding in many cases. To cope with this limitation, we propose a method, dubbed DMotion, that trains a forward model from video data only, via disentangling the motion of the controllable agent to model the transition dynamics. An object extractor and an interaction learner are trained in an end-to-end manner without...
Articulated objects (e.g., doors and drawers) exist everywhere in our life. Different from rigid objects, articulated objects have higher degrees of freedom and are rich in geometries, semantics, and part functions. Modeling different kinds of parts and articulations with neural networks plays an essential role in articulated object understanding and manipulation, and will further benefit the 3D vision and robotics communities. To model articulated objects, most previous works directly encode articulated objects into feature representations, without specific designs for parts, articulations and part motions....
To deal with the problem of premature convergence, the swarm was divided into different types and a different update strategy was carried out on each swarm. In order to improve the algorithm's convergence precision, we also introduced chaos mutation operations to increase the particles' diversity. Meanwhile, in order to remove noise from the raw foundation settlement data, we used the wavelet algorithm. We also made a comparison with the standard particle swarm optimization algorithm in forecasting foundation settlement. The experiment indicated that this method had better global and local searching ability and high precision.