NFDI4DS | UHH-SEMS - Publication Details

Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly

OPENALEX - Publications

Ruihai Wu Chenrui Tie Yushi Du Yan Zhao Hao Dong

Shape assembly aims to reassemble parts (or fragments) into a complete object, which is a common task in our daily life. Different from semantic part assembly (e.g., assembling a chair's semantic parts like legs into a whole chair), geometric part assembly (e.g., assembling bowl fragments into a complete bowl) is an emerging task in computer vision and robotics. Instead of semantic information, this task focuses on the geometric information of parts. As both the geometric and pose spaces of fractured parts are exceptionally large, shape pose disentanglement of part representations is beneficial to geometric shape assembly. In this paper, we propose to leverage SE(3) equivariance...

10.1109/iccv51070.2023.01316 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

TDAPNet: Prototype Network with Recurrent Top-Down Attention for Robust Object Classification under Partial Occlusion

Xiao Ming-qing Adam Kortylewski Ruihai Wu Siyuan Qiao Wei Shen and 1 more

Despite the great success of deep convolutional neural networks in object classification, they suffer from a severe drop in generalization performance under occlusion due to the inconsistency between training and testing data. Because of the large variance of occluders, our goal is a model trained on occlusion-free data while generalizable to occlusion conditions. In this work, we integrate prototypes, partial matching and top-down attention regulation into deep neural networks to realize robust object classification under occlusion. We first introduce...

10.48550/arxiv.1909.03879 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation

Ruihai Wu Chuanruo Ning Hao Dong

Understanding and manipulating deformable objects (e.g., ropes and fabrics) is an essential yet challenging task with broad applications. Difficulties come from the complex states and dynamics, diverse configurations and high-dimensional action space of deformable objects. Besides, the manipulation tasks usually require multiple steps to accomplish, and greedy policies may easily lead to local optimal states. Existing studies tackle this problem using reinforcement learning or imitating expert demonstrations, with limitations in...

10.1109/iccv51070.2023.01005 article EN 2023 IEEE/CVF International Conference on Computer Vision (ICCV) 2023-10-01

Articulated Object Manipulation with Coarse-to-fine Affordance for Mitigating the Effect of Point Cloud Noise

Suhan Ling Yian Wang Ruihai Wu Shiguang Wu Yuzheng Zhuang and 4 more

10.1109/icra57147.2024.10610593 article EN 2024-05-13

Localize, Assemble, and Predicate: Contextual Object Proposal Embedding for Visual Relation Detection

Ruihai Wu Kehan Xu Chenchen Liu Nan Zhuang Yadong Mu

Visual relation detection (VRD) aims to describe all interacting objects in an image using subject-predicate-object triplets. Critically, valid relations grow combinatorially as O(C^2 R) for C object categories and R relationships. The frequencies of relation triplets exhibit a long-tailed distribution, which inevitably leads to bias towards popular visual relations in the learned VRD model. To address this problem, we propose the localize-assemble-predicate network (LAP-Net), which decomposes VRD into three sub-tasks: localizing...

10.1609/aaai.v34i07.6913 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03
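
As a rough illustration of the O(C^2 R) growth mentioned in the abstract above, the following Python sketch counts the candidate subject-predicate-object triplets for C object categories and R relationships. The function name and the example sizes are hypothetical and are not taken from the paper.

```python
def count_candidate_triplets(num_categories: int, num_relationships: int) -> int:
    """Count ordered (subject, predicate, object) triplets: C * R * C."""
    return num_categories * num_relationships * num_categories


# Hypothetical sizes, chosen only to show how quickly the count grows.
for c, r in [(10, 5), (100, 50), (1000, 50)]:
    print(f"C={c}, R={r}: {count_candidate_triplets(c, r)} candidate triplets")
```

Multiplying the number of categories by ten multiplies the candidate count by one hundred, which is why rare triplets are hard to cover with training data.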

Multiple swarms immune clonal quantum-behaved particle swarm optimization algorithm and the wavelet in the application of forecasting foundation settlement

Qiqing Duan Ruihai Wu Jiwen Dong

To solve the problem that the quantum-behaved particle swarm optimization algorithm (QPSO) easily falls into local optima, we proposed a multiple swarms immune clonal quantum-behaved particle swarm optimization algorithm, in which the swarm was divided into two subgroups dynamically according to each particle's fitness. In the subgroup with better fitness, we carried out immune clonal selection with Gaussian mutation to do local searching, and in the other subgroup with Cauchy mutation to do global searching. We also made a comparison with the standard wavelet de-noise. From these experiment results we can see that this improved method had the ability of searching the optimum...

10.1109/car.2010.5456640 article EN 2010-03-01
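
The two-subgroup mutation scheme described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy fitness function, the half-and-half split, the mutation scale, and the greedy keep-if-better selection are all assumptions, and the QPSO position update and the immune clonal step are omitted.

```python
import math
import random


def sphere(x):
    """Toy fitness function (lower is better); an assumption for this sketch."""
    return sum(v * v for v in x)


def split_by_fitness(swarm, fitness):
    """Split the swarm into a better half and a worse half (assumed 50/50 split)."""
    ranked = sorted(swarm, key=fitness)
    mid = len(ranked) // 2
    return ranked[:mid], ranked[mid:]


def gaussian_mutation(x, rng, scale=0.1):
    """Light-tailed perturbation, suited to local search around good particles."""
    return [v + rng.gauss(0.0, scale) for v in x]


def cauchy_mutation(x, rng, scale=0.1):
    """Heavy-tailed perturbation (inverse-CDF Cauchy sample), suited to global search."""
    return [v + scale * math.tan(math.pi * (rng.random() - 0.5)) for v in x]


def mutate_swarm(swarm, fitness, rng):
    """Mutate each subgroup with its own operator; keep a mutant only if it improves."""
    better, worse = split_by_fitness(swarm, fitness)
    originals = better + worse
    mutants = [gaussian_mutation(p, rng) for p in better]
    mutants += [cauchy_mutation(p, rng) for p in worse]
    # Greedy selection is an assumption of this sketch, not stated in the abstract.
    return [m if fitness(m) < fitness(o) else o for m, o in zip(mutants, originals)]
```

Calling `mutate_swarm(swarm, sphere, random.Random(0))` on a list of position vectors returns a swarm of the same size, ordered from the better subgroup to the worse one.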

Diversity guided immune clonal quantum-behaved particle swarm optimization algorithm and the wavelet in the forecasting of foundation settlement

Jiwen Dong Ruihai Wu

In dealing with the problem that the quantum-behaved particle swarm optimization algorithm (QPSO) easily falls into local optima, we proposed a diversity guided immune clonal QPSO. In this algorithm, the swarm was defined with two states: attraction and expansion. During the process, the swarm transferred between the two states repeatedly in reference to its diversity. When in the attraction state, if the diversity is less than the pre-established value, the swarm will carry out immune clonal selection to do searching. We used the wavelet to forecast foundation settlement, and also made a comparison with the standard wavelet. The experiment indicated...

10.1109/icemi.2009.5274249 article EN 2009-08-01

Leveraging SE(3) Equivariance for Learning 3D Geometric Shape Assembly

Ruihai Wu Chenrui Tie Yushi Du Yan Zhao Hao Dong

Shape assembly aims to reassemble parts (or fragments) into a complete object, which is a common task in our daily life. Different from semantic part assembly (e.g., assembling a chair's semantic parts like legs into a whole chair), geometric part assembly (e.g., assembling bowl fragments into a complete bowl) is an emerging task in computer vision and robotics. Instead of semantic information, this task focuses on the geometric information of parts. As both the geometric and pose spaces of fractured parts are exceptionally large, shape pose disentanglement of part representations is beneficial to geometric shape assembly. In this paper, we propose to leverage SE(3) equivariance...

10.48550/arxiv.2309.06810 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Pattern4Ego: Learning Egocentric Video Representation Using Cross-video Activity Patterns

Ruihai Wu Yourong Zhang Yu Qi A. Chen Hao Dong

With the development of Embodied AI, Robotics and Augmented Reality, videos captured from the 'first-person' point of view, also known as egocentric videos, are arousing interest in Computer Vision communities. Further, learning a proper representation of egocentric videos can benefit diverse downstream tasks like action forecasting and human object interactions, which is further beneficial for robotic planning. However, current works mostly focus on the temporal or topological information of video representations, while activity...

10.1145/3652583.3658010 article EN 2024-05-30

NaturalVLM: Leveraging Fine-grained Natural Language for Affordance-Guided Visual Manipulation

Ran Xu Yan Shen Xiaoqi Li Ruihai Wu Hao Dong

10.1109/lra.2024.3477095 article EN cc-by-nc-nd IEEE Robotics and Automation Letters 2024-01-01

GarmentLab: A Unified Simulation and Benchmark for Garment Manipulation

Han Lü Ruihai Wu Yitong Li Sijie Li Ziyu Zhu and 5 more

Manipulating garments and fabrics has long been a critical endeavor in the development of home-assistant robots. However, due to complex dynamics and topological structures, garment manipulations pose significant challenges. Recent successes in reinforcement learning and vision-based methods offer promising avenues for learning garment manipulation. Nevertheless, these approaches are severely constrained by current benchmarks, which offer limited diversity of tasks and unrealistic simulation behavior. Therefore, we present...

10.48550/arxiv.2411.01200 preprint EN arXiv (Cornell University) 2024-11-02

ManipGPT: Is Affordance Segmentation by Large Vision Models Enough for Articulated Object Manipulation?

Taewhan Kim H Bae Zeming Li Xiaoqi Li Iaroslav Ponomarenko and 2 more

Visual actionable affordance has emerged as a transformative approach in robotics, focusing on perceiving interaction areas prior to manipulation. Traditional methods rely on pixel sampling to identify successful interaction samples or on processing pointclouds for affordance mapping. However, these approaches are computationally intensive and struggle to adapt to diverse and dynamic environments. This paper introduces ManipGPT, a framework designed to predict optimal interaction areas for articulated objects using a large pre-trained vision transformer (ViT)....

10.48550/arxiv.2412.10050 preprint EN arXiv (Cornell University) 2024-12-13

EqvAfford: SE(3) Equivariance for Point-Level Affordance Learning

Chen Yue Chenrui Tie Ruihai Wu Hao Dong

Humans perceive and interact with the world with the awareness of equivariance, which helps us manipulate different objects in diverse poses. For robotic manipulation, such equivariance also exists in many scenarios. For example, no matter what the pose of a drawer is (translation, rotation and tilt), the manipulation strategy is consistent (grasp the handle and pull in a line). While traditional models usually do not have the awareness of equivariance for robotic manipulation, which might result in more data for training and poor performance in novel object poses, we propose our EqvAfford...

10.48550/arxiv.2408.01953 preprint EN arXiv (Cornell University) 2024-08-04
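
The equivariance property referred to in the abstract above can be stated as f(g(x)) = g(f(x)) for a rigid transform g. The Python sketch below checks it numerically for two simple point-cloud functions. It is only an illustration under stated assumptions: the function names are hypothetical, the transforms are restricted to rotations about the z-axis plus a translation (a subset of SE(3)), and none of this is the EqvAfford model.

```python
import math


def transform(p, angle, t):
    """Rotate point p about the z-axis by `angle`, then translate by t."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = p
    return (c * x - s * y + t[0], s * x + c * y + t[1], z + t[2])


def centroid(points):
    """Mean of the points; commutes with any rigid transform."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def bbox_max_corner(points):
    """Axis-aligned bounding-box corner; does NOT commute with rotations."""
    return tuple(max(p[i] for p in points) for i in range(3))


def is_equivariant(f, points, angle, t, tol=1e-9):
    """Check f(g(points)) == g(f(points)) for one sampled transform g."""
    lhs = f([transform(p, angle, t) for p in points])
    rhs = transform(f(points), angle, t)
    return all(abs(a - b) <= tol for a, b in zip(lhs, rhs))


cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
print(is_equivariant(centroid, cloud, 0.7, (1.0, -2.0, 0.5)))
print(is_equivariant(bbox_max_corner, cloud, 0.7, (1.0, -2.0, 0.5)))
```

The centroid passes the check while the bounding-box corner fails it, which is the kind of difference an equivariant model is designed to exploit: its predictions move with the object pose instead of having to be relearned for each pose.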

Unpaired Image-to-Image Translation using Adversarial Consistency Loss

Yihao Zhao Ruihai Wu Hao Dong

Unpaired image-to-image translation is a class of vision problems whose goal is to find the mapping between different image domains using unpaired training data. Cycle-consistency loss is a widely used constraint for such problems. However, due to the strict pixel-level constraint, it cannot perform geometric changes, remove large objects, or ignore irrelevant texture. In this paper, we propose a novel adversarial-consistency loss for image-to-image translation. This loss does not require the translated image to be translated back to be a specific source image but can...

10.48550/arxiv.2003.04858 preprint EN other-oa arXiv (Cornell University) 2020-01-01

DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Manipulation

Yan Zhao Ruihai Wu Zhehuan Chen Yourong Zhang Qingnan Fan and 2 more

It is essential yet challenging for future home-assistant robots to understand and manipulate diverse 3D objects in daily human environments. Towards building scalable systems that can perform diverse manipulation tasks over various 3D shapes, recent works have advocated and demonstrated promising results in learning visual actionable affordance, which labels every point over the input 3D geometry with an action likelihood of accomplishing the downstream task (e.g., pushing or picking-up). However, these works only studied...

10.48550/arxiv.2207.01971 preprint EN other-oa arXiv (Cornell University) 2022-01-01

DMotion: Robotic Visuomotor Control with Unsupervised Forward Model Learned from Videos

Haoqi Yuan Ruihai Wu Andrew Zhao Haipeng Zhang Zihan Ding and 1 more

Learning an accurate model of the environment is essential for model-based control tasks. Existing methods in robotic visuomotor control usually learn from data with heavily labelled actions, object entities or locations, which can be demanding in many cases. To cope with this limitation, we propose a method, dubbed DMotion, that trains a forward model from video data only, via disentangling the motion of the controllable agent to model the transition dynamics. An object extractor and an interaction learner are trained in an end-to-end manner without...

10.1109/iros51168.2021.9636362 article EN 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2021-09-27

Learning Foresightful Dense Visual Affordance for Deformable Object Manipulation

Ruihai Wu Chuanruo Ning Hao Dong

Understanding and manipulating deformable objects (e.g., ropes and fabrics) is an essential yet challenging task with broad applications. Difficulties come from the complex states and dynamics, diverse configurations and high-dimensional action space of deformable objects. Besides, the manipulation tasks usually require multiple steps to accomplish, and greedy policies may easily lead to local optimal states. Existing studies tackle this problem using reinforcement learning or imitating expert demonstrations, with limitations in...

10.48550/arxiv.2303.11057 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Learning Part Motion of Articulated Objects Using Spatially Continuous Neural Implicit Representations

Yushi Du Ruihai Wu Yan Shen Hao Dong

Articulated objects (e.g., doors and drawers) exist everywhere in our life. Different from rigid objects, articulated objects have higher degrees of freedom and are rich in geometries, semantics, and part functions. Modeling different kinds of parts and articulations with neural networks plays an essential role in articulated object understanding and manipulation, and will further benefit the 3D vision and robotics communities. To model articulated objects, most previous works directly encode articulated objects into feature representations, without specific designs for parts, articulations and part motions....

10.48550/arxiv.2311.12407 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Application of improved PSO algorithm and wavelet analysis in foundation settlement prediction

Jiwen Dong Ruihai Wu Duan Qi-qing

10.3724/sp.j.1087.02723 article EN Journal of Computer Applications 2009-12-18

Application of foundation settlement prediction based on improved particle swarm algorithm and wavelet de-noising ground settlement prediction

Qiqing Duan Ruihai Wu Jiwen Dong

In dealing with the problem of premature convergence, the swarm was divided into different types and a different update strategy was carried out on each swarm. In order to improve the algorithm's convergence precision, we also introduced chaos and mutation operations to increase the particles' diversity. Meanwhile, in order to remove the noise in the raw foundation settlement data, we introduced the wavelet algorithm. We also made a comparison with standard particle swarm optimization to forecast the settlement. The experiment indicated that this method had better global and local searching ability and high precision.

10.1109/icfcc.2010.5497842 article EN 2010-01-01