NFDI4DS | UHH-SEMS - Publication Details

Detecting Faces Using Inside Cascaded Contextual CNN

OPENALEX - Publications

Kaipeng Zhang Zhanpeng Zhang Hao Wang Zhifeng Li Yu Qiao and 1 more

Deep Convolutional Neural Networks (CNNs) achieve substantial improvements in face detection in the wild. Classical CNN-based methods simply stack successive layers of filters, where an input sample should pass through all layers before reaching a face/non-face decision. Inspired by the fact that, for face detection, deeper layers can discriminate between difficult samples while shallower layers can efficiently reject simple non-face samples, we propose the Inside Cascaded Structure, which introduces classifiers at different layers within...

10.1109/iccv.2017.344 article EN 2017-10-01

Multi-Modality Latent Interaction Network for Visual Question Answering

OPENALEX - Publications

Peng Gao Haoxuan You Zhanpeng Zhang Xiaogang Wang Hongsheng Li

Exploiting relationships between visual regions and question words has achieved great success in learning multi-modality features for Visual Question Answering (VQA). However, we argue that existing methods mostly model relations between individual visual regions and words, which are not enough to correctly answer the question. From a human perspective, answering a visual question requires understanding summarizations of visual and language information. In this paper, we propose the Multi-modality Latent Interaction module (MLI) to tackle this problem. The...

10.1109/iccv.2019.00592 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

A human activity recognition method using wearable sensors based on convtransformer model

OPENALEX - Publications

Zhanpeng Zhang Wenting Wang Aimin An Yuwei Qin Fazhi Yang

10.1007/s12530-022-09480-y article EN Evolving Systems 2023-01-03

Mem2Ego: Empowering Vision-Language Models with Global-to-Ego Memory for Long-Horizon Embodied Navigation

OPENALEX - Publications

Lingfeng Zhang Yuecheng Liu Zhanguang Zhang Maedeh Aghaei Yaochen Hu and 14 more

Recent advancements in Large Language Models (LLMs) and Vision-Language Models (VLMs) have made them powerful tools in embodied navigation, enabling agents to leverage commonsense and spatial reasoning for efficient exploration in unfamiliar environments. Existing LLM-based approaches convert global memory, such as semantic or topological maps, into language descriptions to guide navigation. While this improves efficiency and reduces redundant exploration, the loss of geometric information in language-based...

10.48550/arxiv.2502.14254 preprint EN arXiv (Cornell University) 2025-02-19

MetaGrasp: Data Efficient Grasping by Affordance Interpreter Network

OPENALEX - Publications

Junhao Cai Hui Cheng Zhanpeng Zhang Jingcheng Su

Data-driven approaches to grasping have advanced significantly in recent years, but they usually require large amounts of training data. To increase the efficiency of data collection, this paper presents a novel grasp system that covers the whole pipeline from data collection to model inference. The system can collect effective samples with a corrective strategy assisted by the antipodal rule, and we design an affordance interpreter network to predict a pixelwise affordance map. We define graspability, ungraspability and background as...

10.1109/icra.2019.8793912 article EN 2019 International Conference on Robotics and Automation (ICRA) 2019-05-01

Multi-modality Latent Interaction Network for Visual Question Answering

OPENALEX - Publications

Peng Gao Haoxuan You Zhanpeng Zhang Xiaogang Wang Hongsheng Li

Exploiting relationships between visual regions and question words has achieved great success in learning multi-modality features for Visual Question Answering (VQA). However, we argue that existing methods mostly model relations between individual visual regions and words, which are not enough to correctly answer the question. From a human perspective, answering a visual question requires understanding summarizations of visual and language information. In this paper, we propose the Multi-modality Latent Interaction module (MLI) to tackle this problem. The...

10.48550/arxiv.1908.04289 preprint EN other-oa arXiv (Cornell University) 2019-01-01

FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution

OPENALEX - Publications

Zhanpeng Zhang Kaipeng Zhang

Real-time semantic segmentation is desirable in many robotic applications with limited computation resources. One challenge of semantic segmentation is to deal with object scale variations and leverage the context. How to perform multi-scale context aggregation within a limited computation budget is important. In this paper, firstly, we introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP). It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information. On the other...

10.1109/icra40945.2020.9196599 article EN 2020-05-01

Extraction and structural characterization of hydrolyzable tannins from Coriaria nepalensis leaves

OPENALEX - Publications

Linxin Guo Taotao Qiang Yvrui Yang Ying He Yi Dou and 3 more

10.1016/j.indcrop.2024.118646 article EN Industrial Crops and Products 2024-05-03

Learning Affordance Space in Physical World for Vision-based Robotic Object Manipulation

OPENALEX - Publications

Huadong Wu Zhanpeng Zhang Hui Cheng Kai Yang Jiaming Liu and 1 more

What is a proper representation for objects in manipulation? What would a human try to perceive when manipulating a new object in a new environment? In fact, instead of focusing on the texture and illumination, a human can infer the "affordance" [36] of an object from vision. Here, "affordance" describes the object's intrinsic property that affords a particular type of manipulation. In this work, we investigate whether such affordance can be learned by a deep neural network. In particular, we propose an Affordance Space Perception Network (ASPN) that takes an image as input...

10.1109/icra40945.2020.9196783 article EN 2020-05-01

CCAN: Constraint Co-Attention Network for Instance Grasping

OPENALEX - Publications

Junhao Cai Xuefeng Tao Hui Cheng Zhanpeng Zhang

Instance grasping is a challenging robotic task in which a robot aims to grasp a specified target object in cluttered scenes. In this paper, we propose a novel end-to-end instance grasping method using only monocular workspace and query images, where the workspace image includes several objects and the query image contains only the target object. To effectively extract discriminative features and facilitate the training process, a learning-based method, referred to as Constraint Co-Attention Network (CCAN), is proposed, which consists of a constraint co-attention module...

10.1109/icra40945.2020.9197182 article EN 2020-05-01

ULODNet: A Unified Lane and Obstacle Detection Network Towards Drivable Area Understanding in Autonomous Navigation

OPENALEX - Publications

Zhanpeng Zhang Jiahu Qin Shuai Wang Yu Kang Qingchen Liu

10.1007/s10846-022-01606-3 article EN Journal of Intelligent & Robotic Systems 2022-04-20

From Facial Expression Recognition to Interpersonal Relation Prediction

OPENALEX - Publications

Zhanpeng Zhang Ping Luo Chen Change Loy Xiaoou Tang

Interpersonal relation defines the association, e.g., warmth, friendliness, and dominance, between two or more people. Motivated by psychological studies, we investigate whether such fine-grained and high-level traits can be characterized and quantified from face images in the wild. We address this challenging problem by first studying a deep network architecture for robust recognition of facial expressions. Unlike existing models that typically learn from expression labels alone, we devise an effective multitask network that is...

10.48550/arxiv.1609.06426 preprint EN other-oa arXiv (Cornell University) 2016-01-01

Weakly supervised 6D pose estimation for robotic grasping

OPENALEX - Publications

Yaoxin Li Jinghua Sun Xiaoqian Li Zhanpeng Zhang Hui Cheng and 1 more

Learning-based robotic grasping methods have achieved substantial progress with the development of deep neural networks. However, the requirement of large-scale training data in the real world limits the application scope of these methods. Given 3D models of the target objects, we propose a new learning-based approach built on 6D object pose estimation from a monocular RGB image. We aim to leverage both a synthesized pose dataset and a small-scale real-world weakly labeled dataset (e.g., marking the number of objects in the image) to reduce the system...

10.1145/3284398.3284408 article EN 2018-11-29

Domain centralization and cross-modal reinforcement learning for vision-based robotic manipulation

OPENALEX - Publications

Kai Yang Zhanpeng Zhang Hui Cheng Huadong Wu Ziying Guo

Vision-based robotic manipulation with deep learning methods has achieved substantial advances in the field of automatic agriculture, where it can be deployed and applied in picking, sorting, and transporting agricultural products, and so on. Deep reinforcement learning (DRL) is one of the learning methods that can help a robot learn a policy by itself through exploration and exploitation. Training real robots with DRL comes at a great price, which limits its application scope. Some approaches train in simulation and deploy the model to real robots by transferring images...

10.33440/j.ijpaa.20200302.77 article EN International Journal of Precision Agricultural Aviation 2018-01-01

Super-Identity Convolutional Neural Network for Face Hallucination

OPENALEX - Publications

Kaipeng Zhang Zhanpeng Zhang Chia-Wen Cheng Winston H. Hsu Yu Qiao and 2 more

Face hallucination is a generative task to super-resolve a low-resolution facial image, while human perception of a face heavily relies on identity information. However, previous approaches largely ignore identity recovery. This paper proposes the Super-Identity Convolutional Neural Network (SICNN) to recover identity information for generating faces close to the real identity. Specifically, we define a super-identity loss to measure the identity difference between a hallucinated face and its corresponding high-resolution face within the hypersphere...

10.48550/arxiv.1811.02328 preprint EN other-oa arXiv (Cornell University) 2018-01-01

MetaGrasp: Data Efficient Grasping by Affordance Interpreter Network

OPENALEX - Publications

Junhao Cai Hui Cheng Zhanpeng Zhang Jingcheng Su

Data-driven approaches to grasping have advanced significantly in recent years, but they usually require large amounts of training data. To increase the efficiency of data collection, this paper presents a novel grasp system that covers the whole pipeline from data collection to model inference. The system can collect effective samples with a corrective strategy assisted by the antipodal rule, and we design an affordance interpreter network to predict a pixelwise affordance map. We define graspability, ungraspability and background as...

10.48550/arxiv.1902.06554 preprint EN other-oa arXiv (Cornell University) 2019-01-01

FarSee-Net: Real-Time Semantic Segmentation by Efficient Multi-scale Context Aggregation and Feature Space Super-resolution

OPENALEX - Publications

Zhanpeng Zhang Kaipeng Zhang

Real-time semantic segmentation is desirable in many robotic applications with limited computation resources. One challenge of semantic segmentation is to deal with object scale variations and leverage the context. How to perform multi-scale context aggregation within a limited computation budget is important. In this paper, firstly, we introduce a novel and efficient module called Cascaded Factorized Atrous Spatial Pyramid Pooling (CF-ASPP). It is a lightweight cascaded structure for Convolutional Neural Networks (CNNs) to efficiently leverage context information. On the other...

10.48550/arxiv.2003.03913 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Grasping Novel Objects by Semi-supervised Domain Adaptation

OPENALEX - Publications

Junhao Cai Zhanpeng Zhang Hui Cheng

Learning-based robot arm grasping has attracted increasing interest recently. The algorithm needs to accurately locate the grasp point and angle. Existing methods usually require a large amount of training data from physical robotic trials or synthetic samples in simulation. The system can show promising results with pre-defined objects, but performance may degrade for novel objects without annotation. Inspired by the fact that we have a set of pre-collected data from an external source and only a small quantity of target...

10.1109/rcar47638.2019.9043954 article EN 2019 IEEE International Conference on Real-time Computing and Robotics (RCAR) 2019-08-01

Progress of Chinese and Western Medicine Clinical Research for Migraine

OPENALEX - Publications

Dequan Liu Xiaoju Wang Jingjing Chang Shaoning Zhao Zhanpeng Zhang and 1 more

Migraine is a common, chronic, multifactorial syndrome involving both nervous-system and non-nervous-system disorders. The causes of migraine are still unclear; mental state, diet, endocrine factors, and heredity have been considered contributing factors. Its pathogenesis and therapeutics are being explored constantly. At present, effective migraine treatment in modern medicine is based on non-pharmacological treatment as well as acute and preventive medication. Traditional Chinese medicine has developed day by day, which will establish new directions for migrainous...

10.2991/msetasse-16.2016.382 article EN cc-by-nc 2016-01-01

Fusing Object Context to Detect Functional Area for Cognitive Robots

OPENALEX - Publications

Hui Cheng Junhao Cai Quande Liu Zhanpeng Zhang Kai Yang and 2 more

A cognitive robot usually needs to perform multiple tasks in practice and locate the desired area for each task. Since deep learning has achieved substantial progress in image recognition, a straightforward way to solve this detection problem is to label a functional area (affordance) dataset and apply a well-trained deep-model-based classifier to all potential regions. However, annotation is time consuming, and the requirement of a large amount of training data limits the application scope. We observe that functional areas are related to the surrounding...

10.1109/icra.2018.8460590 article EN 2018-05-01

Normalizing Chinese Disease Names with Multi-feature Fusion

OPENALEX - Publications

Pu Han Zhanpeng Zhang Mingtao Zhang Liang Gu

10.11925/infotech.2096-3467.2020.1211 article EN Shuju fenxi yu zhishi faxian 2021-05-27