NFDI4DS | UHH-SEMS - Publication Details

Volumetric and Multi-view CNNs for Object Classification on 3D Data

OPENALEX - Publications

Charles R. Qi Hao Su Matthias Nießner Angela Dai Mengyuan Yan and 1 more

3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, we witness two types of CNNs being developed: CNNs based upon volumetric representations versus CNNs based upon multi-view representations. Empirical results from these two types of CNNs exhibit a large gap, indicating that existing volumetric CNN architectures and approaches are unable to fully exploit the power of 3D representations. In this paper, we aim to improve both...

10.1109/cvpr.2016.609 article EN 2016-06-01

Do As I Can, Not As I Say: Grounding Language in Robotic Affordances

Michael J. Ahn Anthony Brohan Noah Brown Yevgen Chebotar Omar Andrés Carmona Cortes and 38 more

Large language models can encode a wealth of semantic knowledge about the world. Such knowledge could be extremely useful to robots aiming to act upon high-level, temporally extended instructions expressed in natural language. However, a significant weakness of language models is that they lack real-world experience, which makes it difficult to leverage them for decision making within a given embodiment. For example, asking a language model to describe how to clean a spill might result in a reasonable narrative, but it may not be applicable to a particular agent,...

10.48550/arxiv.2204.01691 preprint EN other-oa arXiv (Cornell University) 2022-01-01

MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences

Xingyu Liu Mengyuan Yan Jeannette Bohg

Understanding dynamic 3D environment is crucial for robotic agents and many other applications. We propose a novel neural network architecture called MeteorNet for learning representations for dynamic 3D point cloud sequences. Different from previous work that adopts a grid-based representation and applies 3D or 4D convolutions, our network directly processes point clouds. We propose two ways to construct spatiotemporal neighborhoods for each point in the point cloud sequence. Information from these neighborhoods is aggregated to learn features per point. We benchmark our network on a variety of...

10.1109/iccv.2019.00934 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

Self-Supervised Learning of State Estimation for Manipulating Deformable Linear Objects

Mengyuan Yan Yilin Zhu Ning Jin Jeannette Bohg

We demonstrate model-based, visual robot manipulation of deformable linear objects. Our approach is based on a state-space representation of the physical system that the robot aims to control. This choice has multiple advantages, including the ease of incorporating physics priors in the dynamics model and perception model, and the ease of planning manipulation actions. In addition, physical states can naturally represent object instances with different appearances. Therefore, dynamics in the state space can be learned in one setting and directly used in other visually different settings. This is in contrast...

10.1109/lra.2020.2969931 article EN IEEE Robotics and Automation Letters 2020-01-28

Sim-to-Real Transfer of Accurate Grasping with Eye-In-Hand Observations and Continuous Control

Mengyuan Yan Iuri Frosio Stephen Tyree Jan Kautz

In the context of deep learning for robotics, we show an effective method of training a real robot to grasp a tiny sphere (1.37 cm in diameter), with an original combination of system design choices. We decompose the end-to-end system into a vision module and a closed-loop controller module. The two modules use target object segmentation as their common interface. The vision module extracts information from the robot end-effector camera, in the form of a binary segmentation mask of the target. We train it to achieve effective domain transfer by composing real background images with simulated images of the target. The controller module takes...

10.48550/arxiv.1712.03303 preprint EN other-oa arXiv (Cornell University) 2017-01-01

Jump-Start Reinforcement Learning

Ikechukwu Uchendu Ted Xiao Yao Lu Banghua Zhu Mengyuan Yan and 7 more

Reinforcement learning (RL) provides a theoretical framework for continuously improving an agent's behavior via trial and error. However, efficiently learning policies from scratch can be very difficult, particularly for tasks with exploration challenges. In such settings, it might be desirable to initialize RL with an existing policy, offline data, or demonstrations. However, naively performing such initialization in RL often works poorly, especially for value-based methods. In this paper, we present a meta algorithm that can use...

10.48550/arxiv.2204.02372 preprint EN other-oa arXiv (Cornell University) 2022-01-01

3D Reconstruction from Full-view Fisheye Camera

Chuiwen Ma Liang Shi Hanlu Huang Mengyuan Yan

In this report, we proposed a 3D reconstruction method for the full-view fisheye camera. The camera we used is the Ricoh Theta, which captures spherical images and has a wide field of view (FOV). The conventional stereo approach based on the perspective camera model cannot be directly applied; instead, we used a spherical camera model to depict the relation between a 3D point and its corresponding observation in the image. We implemented a system that can reconstruct the 3D scene using captures from two or more cameras. A GUI is also created to allow users to control the view perspective and obtain a better intuition...

10.48550/arxiv.1506.06273 preprint EN other-oa arXiv (Cornell University) 2015-01-01

Learning Probabilistic Multi-Modal Actor Models for Vision-Based Robotic Grasping

Mengyuan Yan Adrian Li Mrinal Kalakrishnan Peter Pástor

Many previous works approach vision-based robotic grasping by training a value network that evaluates grasp proposals. These approaches require an optimization process at run-time to infer the best action from the value network. As a result, the inference time grows exponentially as the dimension of the action space increases. We propose an alternative method, by directly training a neural density model to approximate the conditional distribution of successful grasp poses from the input images. We construct a neural network that combines Gaussian mixture and normalizing flows, which is...

10.1109/icra.2019.8794024 article EN 2019 International Conference on Robotics and Automation (ICRA) 2019-05-01

Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators

Alexander Herzog Kanishka Rao Karol Hausman Yao Lu Paul Wohlhart and 35 more

We describe a system for deep reinforcement learning of robotic manipulation skills applied to a large-scale real-world task: sorting recyclables and trash in office buildings. Real-world deployment of deep RL policies requires not only effective training algorithms, but the ability to bootstrap real-world training and enable broad generalization. To this end, our system combines scalable deep RL from real-world data with bootstrapping from training in simulation, and incorporates auxiliary inputs from existing computer vision systems as a way to boost generalization to novel...

10.15607/rss.2023.xix.022 article EN 2023-07-10

Learning Topological Motion Primitives for Knot Planning

Mengyuan Yan Gen Li Yilin Zhu Jeannette Bohg

In this paper, we approach the challenging problem of motion planning for knot tying. We propose a hierarchical approach in which the top layer produces a topological plan and the bottom layer translates this plan into continuous robot motion. The top layer decomposes a knotting task into sequences of abstract topological actions based on knot theory. The bottom layer translates each of these abstract actions into robot motion trajectories through learned topological motion primitives. To adapt each topological action to the specific rope geometry, the motion primitives take the observed rope configuration as input. We train the motion primitives by imitating human demonstrations and reinforcement learning...

10.1109/iros45743.2020.9341330 article EN 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2020-10-24

AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale

Yao Lu Karol Hausman Yevgen Chebotar Mengyuan Yan Eric B. Jang and 6 more

Robotic skills can be learned via imitation learning (IL) using user-provided demonstrations, or via reinforcement learning (RL) using large amounts of autonomously collected experience. Both methods have complementary strengths and weaknesses: RL can reach a high level of performance, but requires exploration, which can be very time consuming and unsafe; IL does not require exploration, but only learns skills that are as good as the provided demonstrations. Can a single method combine the strengths of both approaches? A number of prior methods have aimed to address...

10.48550/arxiv.2111.05424 preprint EN public-domain arXiv (Cornell University) 2021-01-01

Modified Auto-Focusing Algorithm for High Squint Diving SAR Imaging Based on the Back-Projection Algorithm with Spectrum Alignment and Truncation

Anqi Gao Bing Sun Mengyuan Yan Xue Chen Jingwen Li

The study focuses on addressing the image defocusing issue caused by motion errors in highly squinted Synthetic Aperture Radar (SAR). The traditional auto-focusing algorithm, Phase Gradient Autofocus (PGA), is not effective in this mode due to difficulties in estimating the phase gradient accurately from strong point targets. Two main reasons contribute to this problem. Firstly, the direction of the energy-distributed lines in the Point Spread Function (PSF) does not align with the image’s azimuth direction in this mode. Secondly, the wavenumber spectrum...

10.3390/rs15122976 article EN cc-by Remote Sensing 2023-06-07

Nonlocal Elastica Model for Sparse Reconstruction

Mengyuan Yan Yuping Duan

10.1007/s10851-019-00943-7 article EN Journal of Mathematical Imaging and Vision 2020-01-23

How to Close Sim-Real Gap? Transfer with Segmentation!

Mengyuan Yan Qingyun Sun Iuri Frosio Stephen Tyree Jan Kautz

One fundamental difficulty in robotic learning is the sim-real gap problem. In this work, we propose to use segmentation as the interface between perception and control, as a domain-invariant state representation. We identify two sources of the sim-real gap: one is the dynamics sim-real gap, the other is the visual sim-real gap. To close the dynamics sim-real gap, we propose to use closed-loop control. For a complex task with segmentation mask input, we further propose to learn a closed-loop model-free control policy with a deep neural network using imitation learning. To close the visual sim-real gap, we propose to learn a perception model in the real environment using simulated target plus real background image, without any...

10.48550/arxiv.2005.07695 preprint EN other-oa arXiv (Cornell University) 2020-01-01

GRAC: Self-Guided and Self-Regularized Actor-Critic

Lin Shao Yifan You Mengyuan Yan Qingyun Sun Jeannette Bohg

Deep reinforcement learning (DRL) algorithms have successfully been demonstrated on a range of challenging decision making and control tasks. One dominant component of recent deep reinforcement learning algorithms is the target network, which mitigates the divergence when learning the Q function. However, target networks can slow down the learning process due to delayed function updates. Our main contribution in this work is a self-regularized TD-learning method to address divergence without requiring a target network. Additionally, we propose a self-guided policy improvement method by combining...

10.48550/arxiv.2009.08973 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Volumetric and Multi-View CNNs for Object Classification on 3D Data

Charles R. Qi Hao Su Matthias Nießner Angela Dai Mengyuan Yan and 1 more

3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, we witness two types of CNNs being developed: CNNs based upon volumetric representations versus CNNs based upon multi-view representations. Empirical results from these two types of CNNs exhibit a large gap, indicating that existing volumetric CNN architectures and approaches are unable to fully exploit the power of 3D representations. In this paper, we aim to improve both...

10.48550/arxiv.1604.03265 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Analysis of ignition by a plane laminar thermal plume

Mengyuan Yan Lea-Der Chen G. M. Faeth

10.1016/0010-2180(84)90073-7 article EN Combustion and Flame 1984-10-01

Learning Probabilistic Multi-Modal Actor Models for Vision-Based Robotic Grasping

Mengyuan Yan Adrian Li Mrinal Kalakrishnan Peter Pástor

Many previous works approach vision-based robotic grasping by training a value network that evaluates grasp proposals. These approaches require an optimization process at run-time to infer the best action from the value network. As a result, the inference time grows exponentially as the dimension of the action space increases. We propose an alternative method, by directly training a neural density model to approximate the conditional distribution of successful grasp poses from the input images. We construct a neural network that combines Gaussian mixture and normalizing flows, which is...

10.48550/arxiv.1904.07319 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Self-Supervised Learning of State Estimation for Manipulating Deformable Linear Objects

Mengyuan Yan Yilin Zhu Ning Jin Jeannette Bohg

We demonstrate model-based, visual robot manipulation of linear deformable objects. Our approach is based on a state-space representation of the physical system that the robot aims to control. This choice has multiple advantages, including the ease of incorporating physics priors in the dynamics model and perception model, and the ease of planning manipulation actions. In addition, physical states can naturally represent object instances with different appearances. Therefore, dynamics in the state space can be learned in one setting and directly used in other visually different settings. This is in contrast...

10.48550/arxiv.1911.06283 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Learning Topological Motion Primitives for Knot Planning

Mengyuan Yan Gen Li Yilin Zhu Jeannette Bohg

In this paper, we approach the challenging problem of motion planning for knot tying. We propose a hierarchical approach in which the top layer produces a topological plan and the bottom layer translates this plan into continuous robot motion. The top layer decomposes a knotting task into sequences of abstract topological actions based on knot theory. The bottom layer translates each of these abstract actions into robot motion trajectories through learned topological motion primitives. To adapt each topological action to the specific rope geometry, the motion primitives take the observed rope configuration as input. We train the motion primitives by imitating human demonstrations and reinforcement learning...

10.48550/arxiv.2009.02615 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Deep RL at Scale: Sorting Waste in Office Buildings with a Fleet of Mobile Manipulators

Alexander Herzog Kanishka Rao Karol Hausman Yao Lu Paul Wohlhart and 35 more

We describe a system for deep reinforcement learning of robotic manipulation skills applied to a large-scale real-world task: sorting recyclables and trash in office buildings. Real-world deployment of deep RL policies requires not only effective training algorithms, but the ability to bootstrap real-world training and enable broad generalization. To this end, our system combines scalable deep RL from real-world data with bootstrapping from training in simulation, and incorporates auxiliary inputs from existing computer vision systems as a way to boost generalization to novel objects,...

10.48550/arxiv.2305.03270 preprint EN cc-by arXiv (Cornell University) 2023-01-01

Stabilization of premixed flames in mixed-convection laminar plumes

Mengyuan Yan G. M. Faeth

10.1016/0010-2180(86)90107-0 article EN Combustion and Flame 1986-01-01

MeteorNet: Deep Learning on Dynamic 3D Point Cloud Sequences

Xingyu Liu Mengyuan Yan Jeannette Bohg

Understanding dynamic 3D environment is crucial for robotic agents and many other applications. We propose a novel neural network architecture called MeteorNet for learning representations for dynamic 3D point cloud sequences. Different from previous work that adopts a grid-based representation and applies 3D or 4D convolutions, our network directly processes point clouds. We propose two ways to construct spatiotemporal neighborhoods for each point in the point cloud sequence. Information from these neighborhoods is aggregated to learn features per point. We benchmark our network on a variety of...

10.48550/arxiv.1910.09165 preprint EN other-oa arXiv (Cornell University) 2019-01-01