NFDI4DS | UHH-SEMS - Publication Details

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

OPENALEX - Publications

Liang-Chieh Chen Alexander Hermans George Papandreou Florian Schroff Peng Wang and 1 more

In this work, we tackle the problem of instance segmentation, the task of simultaneously solving object detection and semantic segmentation. Towards this goal, we present a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. Building on top of the Faster-RCNN object detector, the predicted boxes provide accurate localization of object instances. Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction. Semantic segmentation assists the model in distinguishing between objects...

10.1109/cvpr.2018.00422 article EN 2018-06-01

Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition

Hui Li Peng Wang Chunhua Shen Guyu Zhang

Recognizing irregular text in natural scene images is challenging due to the large variance in text appearance, such as curvature, orientation and distortion. Most existing approaches rely heavily on sophisticated model designs and/or extra fine-grained annotations, which, to some extent, increase the difficulty of algorithm implementation and data collection. In this work, we propose an easy-to-implement strong baseline for irregular scene text recognition, using off-the-shelf neural network components and only word-level annotations. It...

10.1609/aaai.v33i01.33018610 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

NAS-FCOS: Fast Neural Architecture Search for Object Detection

Ning Wang Yang Gao Hao Chen Peng Wang Zhi Tian and 2 more

The success of deep neural networks relies on significant architecture engineering. Recently, neural architecture search (NAS) has emerged as a promise to greatly reduce manual effort in network design by automatically searching for optimal architectures, although typically such algorithms need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, on challenging vision tasks such as object detection, NAS, especially fast versions of NAS, is less studied. Here we propose to search for the decoder structure of object detectors...

10.1109/cvpr42600.2020.01196 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Global Mask R-CNN for marine ship instance segmentation

Yuxin Sun Li Su Yongkang Luo Hao Meng Wanyi Li and 3 more

10.1016/j.neucom.2022.01.017 article EN Neurocomputing 2022-01-22

HR-NAS: Searching Efficient High-Resolution Neural Architectures with Lightweight Transformers

Mingyu Ding Xiaochen Lian Linjie Yang Peng Wang Xiaojie Jin and 2 more

High-resolution representations (HR) are essential for dense prediction tasks such as segmentation, detection, and pose estimation. Learning HR representations is typically ignored in previous Neural Architecture Search (NAS) methods that focus on image classification. This work proposes a novel NAS method, called HR-NAS, which is able to find efficient and accurate networks for different tasks, by effectively encoding multiscale contextual information while maintaining high-resolution representations. In HR-NAS, we renovate...

10.1109/cvpr46437.2021.00300 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

End-to-End Modeling via Information Tree for One-Shot Natural Language Spatial Video Grounding

Mengze Li Tianbao Wang Haoyu Zhang Shengyu Zhang Zhou Zhao and 7 more

Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Jiaxu Miao, Wenqiao Zhang, Wenming Tan, Jin Wang, Peng Wang, Shiliang Pu, Fei Wu. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2022.

10.18653/v1/2022.acl-long.596 article EN cc-by Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) 2022-01-01

NAS-FCOS: Efficient Search for Object Detection Architectures

Ning Wang Yang Gao Hao Chen Peng Wang Zhi Tian and 2 more

10.1007/s11263-021-01523-2 article EN International Journal of Computer Vision 2021-10-15

Mitigating biases in long-tailed recognition via semantic-guided feature transfer

Sheng Shi Peng Wang Xinfeng Zhang Jianping Fan

10.1016/j.neucom.2024.127735 article EN Neurocomputing 2024-04-23

An Adaptive Correlation Filtering Method for Text-Based Person Search

Mengyang Sun Wei Suo Peng Wang Kai Niu Le Liu and 3 more

10.1007/s11263-024-02094-8 article EN International Journal of Computer Vision 2024-05-16

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

Xiaojie Jin Bowen Zhang Weibo Gong Kai Xu Xueqing Deng and 4 more

10.1109/cvpr52733.2024.02563 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

NAS-FCOS: Fast Neural Architecture Search for Object Detection

Ning Wang Yang Gao Hao Chen Peng Wang Zhi Tian and 2 more

The success of deep neural networks relies on significant architecture engineering. Recently, neural architecture search (NAS) has emerged as a promise to greatly reduce manual effort in network design by automatically searching for optimal architectures, although typically such algorithms need an excessive amount of computational resources, e.g., a few thousand GPU-days. To date, on challenging vision tasks such as object detection, NAS, especially fast versions of NAS, is less studied. Here we propose to search for the decoder structure of object detectors...

10.48550/arxiv.1906.04423 preprint EN other-oa arXiv (Cornell University) 2019-01-01

Accurate Camouflaged Object Detection via Mixture Convolution and Interactive Fusion

Geng Chen Xinrui Chen Bo Dong Mingchen Zhuge Yongxiong Wang and 4 more

Camouflaged object detection (COD), which aims to identify objects that conceal themselves in their surroundings, has recently drawn increasing research efforts in the field of computer vision. In practice, the success of deep learning based COD is mainly determined by two key factors: (i) a significantly large receptive field, which provides rich context information, and (ii) an effective fusion strategy, which aggregates multi-level features for accurate COD. Motivated by these observations, in this paper,...

10.48550/arxiv.2101.05687 preprint EN other-oa arXiv (Cornell University) 2021-01-01

MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features

Liang-Chieh Chen Alexander Hermans George Papandreou Florian Schroff Peng Wang and 1 more

In this work, we tackle the problem of instance segmentation, the task of simultaneously solving object detection and semantic segmentation. Towards this goal, we present a model, called MaskLab, which produces three outputs: box detection, semantic segmentation, and direction prediction. Building on top of the Faster-RCNN object detector, the predicted boxes provide accurate localization of object instances. Within each region of interest, MaskLab performs foreground/background segmentation by combining semantic and direction prediction. Semantic segmentation assists the model in distinguishing between objects...

10.48550/arxiv.1712.04837 preprint EN other-oa arXiv (Cornell University) 2017-01-01

GroupNet: Learning to group corner for object detection in remote sensing imagery

Lei Ni Chunlei Huo Xin Zhang Peng Wang Zhixin Zhou

Due to the attractive potential in avoiding the elaborate definition of anchor attributes, anchor-free-based deep learning approaches are promising for object detection in remote sensing imagery. CornerNet is one of the most representative methods among these approaches. However, it can be observed distinctly from visual inspection that CornerNet is limited in grouping keypoints, which significantly impacts detection performance. To address the above problem, a novel and effective approach, called GroupNet, is presented in this paper, which adaptively groups...

10.1016/j.cja.2021.09.016 article EN cc-by-nc-nd Chinese Journal of Aeronautics 2021-10-26

Implementation of Occluded Pedestrian Detection Method Based on Improved YOLOv5 in ROS Platform

Qingsen Su Guangliang Liu Yanfang Zhang Peng Wang Yali Wang and 1 more

In this paper, we propose an improved YOLOv5 pedestrian detection algorithm to solve the problems of missed targets and low accuracy on the ROS platform. By adding a small-target detection layer of 160*160, the method improves the detection performance of the model and effectively reduces the false detection rate for occluded pedestrians, especially heavily occluded targets. In order to further improve the detection accuracy, it fuses the underlying features of the backbone network to achieve path aggregation with multi-feature fusion. Furthermore, Soft-DIoU-NMS is used for post-detection processing...

10.1109/icras57898.2023.10221598 article EN 2023-06-16

Ground-to-Aerial Person Search: Benchmark Dataset and Approach

Shizhou Zhang Qingchun Yang De Cheng Yinghui Xing Guoqiang Liang and 2 more

In this work, we construct a large-scale dataset for Ground-to-Aerial Person Search, named G2APS, which contains 31,770 images with 260,559 annotated bounding boxes for 2,644 identities appearing in both the UAVs and ground surveillance cameras. To our knowledge, this is the first dataset for cross-platform intelligent applications, where the UAVs could work as a powerful complement to the ground surveillance cameras. To more realistically simulate actual scenarios, the cameras are fixed about 2 meters above the ground, while the UAVs capture videos of persons at different location,...

10.1145/3581783.3612105 preprint EN 2023-10-26

COCONut: Modernizing COCO Segmentation

Xueqing Deng Qihang Yu Peng Wang Xiaohui Shen Liang-Chieh Chen

In recent decades, the vision community has witnessed remarkable progress in visual recognition, partially owing to advancements in dataset benchmarks. Notably, the established COCO benchmark has propelled the development of modern detection and segmentation systems. However, the COCO segmentation benchmark has seen comparatively slow improvement over the last decade. Originally equipped with coarse polygon annotations for thing instances, it gradually incorporated coarse superpixel annotations for stuff regions, which were subsequently heuristically amalgamated to yield...

10.48550/arxiv.2404.08639 preprint EN arXiv (Cornell University) 2024-04-12

Depth Priors in Removal Neural Radiance Fields

Zhihao Guo Peng Wang

Neural Radiance Fields (NeRF) have shown impressive results in 3D reconstruction and generating novel views. A key challenge within NeRF is the editing of reconstructed scenes, such as object removal, which requires maintaining consistency across multiple views and ensuring high-quality synthesised perspectives. Previous studies have incorporated depth priors, typically from LiDAR or sparse depth measurements provided by COLMAP, to improve the performance of object removal in NeRF. However, these methods are either costly...

10.48550/arxiv.2405.00630 preprint EN arXiv (Cornell University) 2024-05-01

FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension

Junzhuo Liu Xingyi Yang Wei Li Peng Wang

Referring Expression Comprehension (REC) is a crucial cross-modal task that objectively evaluates the capabilities of language understanding, image comprehension, and language-to-image grounding. Consequently, it serves as an ideal testing ground for Multi-modal Large Language Models (MLLMs). In pursuit of this goal, we have established a new REC dataset characterized by two key features: Firstly, it is designed with controllable varying levels of difficulty, necessitating multi-level fine-grained...

10.48550/arxiv.2409.14750 preprint EN arXiv (Cornell University) 2024-09-23

Weakly Supervised Video Anomaly Detection and Localization with Spatio-Temporal Prompts

Peng Wu Xuerong Zhou Guansong Pang Zhiwei Yang Qingsen Yan and 2 more

10.1145/3664647.3681442 article EN 2024-10-26

A Plug-and-Play Method for Rare Human-Object Interactions Detection by Bridging Domain Gap

Lijun Zhang Wei Suo Peng Wang Yanning Zhang

10.1145/3664647.3680666 article EN 2024-10-26

FineCops-Ref: A new Dataset and Task for Fine-Grained Compositional Referring Expression Comprehension

Junzhuo Liu Xingyi Yang Wei Li Peng Wang

10.18653/v1/2024.emnlp-main.864 article EN Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing 2024-01-01