NFDI4DS | UHH-SEMS - Publication Details

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios

OPENALEX - Publications

Xingkui Zhu Shuchang Lyu Xu Wang Qi Zhao

Object detection on drone-captured scenarios is a recent popular task. As drones always navigate at different altitudes, the object scale varies violently, which burdens the optimization of networks. Moreover, high-speed and low-altitude flight brings motion blur on the densely packed objects, which leads to a great challenge for object distinction. To solve the two issues mentioned above, we propose TPH-YOLOv5. Based on YOLOv5, we add one more prediction head to detect different-scale objects. Then we replace the original prediction heads with...

10.1109/iccvw54120.2021.00312 article EN 2021-10-01

VisDrone-DET2021: The Vision Meets Drone Object detection Challenge Results

OPENALEX - Publications

Yaru Cao Zhijian He Lujia Wang Wenguan Wang Yixuan Yuan and 32 more

Object detection on the drone faces a great diversity of challenges such as small object inference, background clutter and wide viewpoint. In contrast to the traditional detection problem in computer vision, object detection from a bird-like angle cannot be transplanted directly from commonly used methods due to the special object texture in the sky's view. However, due to the lack of a comprehensive data set, the number of algorithms that focus on object detection using data captured by drones is limited. So the VisDrone team gathered a massive data set and organized Vision Meets Drones: A Challenge...

10.1109/iccvw54120.2021.00319 article EN 2021-10-01

TPH-YOLOv5++: Boosting Object Detection on Drone-Captured Scenarios with Cross-Layer Asymmetric Transformer

OPENALEX - Publications

Qi Zhao Binghao Liu Shuchang Lyu Chunlei Wang Hong Zhang

Object detection in drone-captured images has been a popular task in recent years. As drones always navigate at different altitudes, the object scale varies considerably, which burdens the optimization of models. Moreover, high-speed and low-altitude flight causes motion blur on densely packed objects, which leads to great challenges. To solve the two issues mentioned above, based on YOLOv5, we add an additional prediction head to detect tiny-scale objects and replace CNN-based prediction heads with transformer prediction heads (TPH), constructing...

10.3390/rs15061687 article EN cc-by Remote Sensing 2023-03-21

Self-training guided disentangled adaptation for cross-domain remote sensing image semantic segmentation

OPENALEX - Publications

Qi Zhao Shuchang Lyu Hongbo Zhao Binghao Liu Lijiang Chen and 1 more

Remote sensing (RS) image semantic segmentation using deep convolutional neural networks (DCNNs) has shown great success in various applications. However, the high dependence on annotated data makes it challenging for DCNNs to adapt to different RS scenes. To address this challenge, we propose a cross-domain RS image semantic segmentation task that considers ground sampling distance, remote sensing sensor variation, and different geographical landscapes as the main factors causing domain shifts between the source and target images. To mitigate the negative...

10.1016/j.jag.2023.103646 article EN cc-by-nc-nd International Journal of Applied Earth Observation and Geoinformation 2024-01-05

MGML: Multigranularity Multilevel Feature Ensemble Network for Remote Sensing Scene Classification

OPENALEX - Publications

Qi Zhao Shuchang Lyu Yuewen Li Yujing Ma Lijiang Chen

Remote sensing (RS) scene classification is a challenging task to predict scene categories of RS images. RS images have two main issues: large intraclass variance caused by large resolution variance and confusing information from the large geographic covering area. To ease the negative influence of the above two issues, we propose a multigranularity multilevel feature ensemble network (MGML-FENet) to efficiently tackle the RS scene classification task in this article. Specifically, we propose a multigranularity multilevel feature fusion branch (MGML-FFB) to extract multigranularity features in different levels of the network by a channel-separate feature generator...

10.1109/tnnls.2021.3106391 article EN cc-by-nc-nd IEEE Transactions on Neural Networks and Learning Systems 2021-09-01

A Self-Distillation Embedded Supervised Affinity Attention Model for Few-Shot Segmentation

OPENALEX - Publications

Qi Zhao Binghao Liu Shuchang Lyu Huojin Chen

Few-shot segmentation focuses on the generalization of models to segment unseen objects with limited annotated samples. However, existing approaches still face two main challenges. First, the huge feature distinction between support and query images causes a knowledge transferring barrier, which harms the segmentation performance. Second, limited support prototypes cannot adequately represent the features of support objects, making it hard to guide high-quality query segmentation. To deal with the above two issues, we propose a self-distillation embedded supervised affinity...

10.1109/tcds.2023.3251371 article EN cc-by IEEE Transactions on Cognitive and Developmental Systems 2023-03-02

OV-VG: A benchmark for open-vocabulary visual grounding

OPENALEX - Publications

Chunlei Wang Wenquan Feng Xiangtai Li Guangliang Cheng Shuchang Lyu and 3 more

Open-vocabulary learning has emerged as a cutting-edge research area, particularly in light of the widespread adoption of vision-based foundational models. Its primary objective is to comprehend novel concepts that are not encompassed within a predefined vocabulary. One key facet of this endeavor is Visual Grounding (VG), which entails locating a specific region in an image based on a corresponding language description. While current models excel at various visual tasks, there is a noticeable absence of models specifically...

10.1016/j.neucom.2024.127738 article EN cc-by-nc Neurocomputing 2024-04-24

AlphaMEX: A smarter global pooling method for convolutional neural networks

OPENALEX - Publications

Boxue Zhang Qi Zhao Wenquan Feng Shuchang Lyu

Deep convolutional neural networks have achieved great success on image classification. A series of feature extractors learned from CNNs have been used in many computer vision tasks. The global pooling layer plays a very important role in deep convolutional neural networks. It is found that the input feature-maps of global pooling become sparse with the increasing use of the Batch Normalization and ReLU layer combination, which makes the original global pooling inefficient. In this paper, we propose a novel end-to-end trainable global pooling operator, AlphaMEX Global Pool, for convolutional neural networks....

10.1016/j.neucom.2018.07.079 article EN cc-by-nc-nd Neurocomputing 2018-09-13

A feature consistency driven attention erasing network for fine-grained image retrieval

OPENALEX - Publications

Qi Zhao Xu Wang Shuchang Lyu Binghao Liu Y. F. Yang

10.1016/j.patcog.2022.108618 article EN Pattern Recognition 2022-03-02

Embedded Self-Distillation in Compact Multibranch Ensemble Network for Remote Sensing Scene Classification

OPENALEX - Publications

Qi Zhao Yujing Ma Shuchang Lyu Lijiang Chen

The remote sensing (RS) image scene classification task faces many challenges due to the interference from different characteristics of different geographical elements. To solve this problem, we propose a multi-branch ensemble network to enhance the feature representation ability by fusing features in the final output logits and intermediate feature maps. However, simply adding branches will increase the complexity of models and reduce the inference efficiency. To address this issue, we embed a self-distillation (SD) method to transfer knowledge from the ensemble network to the main-branch...

10.1109/tgrs.2021.3126770 article EN cc-by IEEE Transactions on Geoscience and Remote Sensing 2021-11-08

Learn by Oneself: Exploiting Weight-Sharing Potential in Knowledge Distillation Guided Ensemble Network

OPENALEX - Publications

Qi Zhao Shuchang Lyu Lijiang Chen Binghao Liu Ting-Bing Xu and 2 more

Recent CNNs (convolutional neural networks) have become more and more compact. The elegant structure design highly improves the performance of CNNs. With the development of the knowledge distillation technique, the performance of CNNs gets further improved. However, existing knowledge distillation guided methods either rely on offline pretrained high-quality large teacher models or on an online heavy training burden. To solve the above problems, we propose a feature-sharing and weight-sharing based ensemble network (training framework) guided by knowledge distillation (EKD-FWSNet) to make...

10.1109/tcsvt.2023.3267115 article EN cc-by IEEE Transactions on Circuits and Systems for Video Technology 2023-04-14

SWIN-TOD: Smooth Wasserstein Distance and Instance-level Neighboring Enhancement for Remote Sensing Tiny Object Detection

OPENALEX - Publications

Guangbiao Wang Hongbo Zhao Shuchang Lyu Guangliang Cheng Qing Chang and 3 more

10.1109/tgrs.2024.3452010 article EN cc-by-nc-nd IEEE Transactions on Geoscience and Remote Sensing 2024-01-01

Multiactivation Pooling Method in Convolutional Neural Networks for Image Recognition

OPENALEX - Publications

Qi Zhao Shuchang Lyu Boxue Zhang Wenquan Feng

Convolutional neural networks (CNNs) are becoming more and more popular today. CNNs have now become a popular feature extractor applied to image processing, big data processing, fog computing, etc. CNNs usually consist of several basic units like the convolutional unit, pooling unit, activation unit, and so on. In CNNs, conventional pooling methods refer to 2×2 max-pooling and average-pooling, which are applied after the convolutional or ReLU layers. In this paper, we propose a Multiactivation Pooling (MAP) Method to make the CNNs more accurate on classification tasks without increasing depth...

10.1155/2018/8196906 article EN cc-by Wireless Communications and Mobile Computing 2018-01-01

VALNet: Vision-Based Autonomous Landing with Airport Runway Instance Segmentation

OPENALEX - Publications

Wang Qiang Wenquan Feng Hongbo Zhao Binghao Liu Shuchang Lyu

Visual navigation, characterized by its autonomous capabilities, cost effectiveness, and robust resistance to interference, serves as the foundation for vision-based autonomous landing systems. These systems rely heavily on runway instance segmentation, which accurately divides runway areas and provides precise information for unmanned aerial vehicle (UAV) navigation. However, current research primarily focuses on runway detection but lacks relevant runway instance segmentation datasets. To address this research gap, we created the Runway Landing Dataset...

10.3390/rs16122161 article EN cc-by Remote Sensing 2024-06-14

TPH-YOLOv5: Improved YOLOv5 Based on Transformer Prediction Head for Object Detection on Drone-captured Scenarios

OPENALEX - Publications

Xingkui Zhu Shuchang Lyu Xu Wang Qi Zhao

Object detection on drone-captured scenarios is a recent popular task. As drones always navigate at different altitudes, the object scale varies violently, which burdens the optimization of networks. Moreover, high-speed and low-altitude flight brings motion blur on the densely packed objects, which leads to a great challenge for object distinction. To solve the two issues mentioned above, we propose TPH-YOLOv5. Based on YOLOv5, we add one more prediction head to detect different-scale objects. Then we replace the original prediction heads with...

10.48550/arxiv.2108.11539 preprint EN other-oa arXiv (Cornell University) 2021-01-01

A CNN-SIFT Hybrid Pedestrian Navigation Method Based on First-Person Vision

OPENALEX - Publications

Qi Zhao Boxue Zhang Shuchang Lyu Hong Zhang Daniel Sun and 2 more

The emergence of new wearable technologies, such as action cameras and smart glasses, has driven the use of the first-person perspective in computer applications. This field is now attracting the attention and investment of researchers aiming to develop methods to process first-person vision (FPV) video. The current approaches present particular combinations of different image features and quantitative methods to accomplish specific objectives, such as object detection, activity recognition, user–machine interaction, etc. FPV-based navigation is necessary...

10.3390/rs10081229 article EN cc-by Remote Sensing 2018-08-05

WEA-DINO: An Improved DINO With Word Embedding Alignment for Remote Scene Zero-Shot Object Detection

OPENALEX - Publications

Guangbiao Wang Hongbo Zhao Qing Chang Shuchang Lyu Guangliang Cheng and 1 more

10.1109/lgrs.2024.3408875 article EN IEEE Geoscience and Remote Sensing Letters 2024-01-01

Interpretable Relative Squeezing bottleneck design for compact convolutional neural networks model

OPENALEX - Publications

Qi Zhao Jiahui Liu Boxue Zhang Shuchang Lyu Nauman Raoof and 1 more

Convolutional neural networks (CNNs) are mainly used for image recognition tasks. However, some huge models are infeasible for mobile devices because of limited computing and memory resources. In this paper, feature maps of DenseNet and CondenseNet are visualized. It can be observed that there are some feature channels in a locked state that have similar distribution properties, which could be compressed further. Thus, in this work, a novel architecture, RSNet, is introduced to improve the computing efficiency of CNNs. This paper proposes Relative-Squeezing (RS)...

10.1016/j.imavis.2019.06.006 article EN cc-by-nc-nd Image and Vision Computing 2019-07-07

Detection Method of Infected Wood on Digital Orthophoto Map–Digital Surface Model Fusion Network

OPENALEX - Publications

Guangbiao Wang Hongbo Zhao Qing Chang Shuchang Lyu Binghao Liu and 2 more

Pine wilt disease (PWD) is a worldwide affliction that poses a significant menace to forest ecosystems. The swift and precise identification of pine trees under infection holds paramount significance in the proficient administration of this ailment. The progression of remote sensing and deep learning methodologies has propelled the utilization of target detection and recognition techniques reliant on remote sensing imagery, emerging as the prevailing strategy for pinpointing affected trees. Although the existing object detection algorithms have...

10.3390/rs15174295 article EN cc-by Remote Sensing 2023-08-31

Iterative Robust Visual Grounding with Masked Reference based Centerpoint Supervision

OPENALEX - Publications

Menghao Li Chunlei Wang Wenquan Feng Shuchang Lyu Guangliang Cheng and 3 more

Visual Grounding (VG) aims at localizing target objects from an image based on given expressions and has made significant progress with the development of detection and vision transformers. However, existing VG methods tend to generate false-alarm objects when presented with inaccurate or irrelevant descriptions, which commonly occur in practical applications. Moreover, existing methods fail to capture fine-grained features, accurate localization, and sufficient context comprehension from the whole image and textual descriptions. To address both...

10.1109/iccvw60793.2023.00501 article EN 2023-10-02

MMOTU: A Multi-Modality Ovarian Tumor Ultrasound Image Dataset for Unsupervised Cross-Domain Semantic Segmentation

OPENALEX - Publications

Qi Zhao Shuchang Lyu Wenpei Bai Linghan Cai Binghao Liu and 4 more

Ovarian cancer is one of the most harmful gynecological diseases. Detecting ovarian tumors in the early stage with computer-aided techniques can efficiently decrease the mortality rate. With the improvement of the medical treatment standard, ultrasound images are widely applied in clinical treatment. However, recent notable methods mainly focus on single-modality ultrasound ovarian tumor segmentation or recognition, which means research exploring the representation capability of multi-modality ultrasound ovarian tumor images is still lacking. To solve this...

10.48550/arxiv.2207.06799 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Feature reconstruction and metric based network for few-shot object detection

OPENALEX - Publications

Yuewen Li Wenquan Feng Shuchang Lyu Qi Zhao

In the object detection task, deep learning-based methods always need a large amount of annotated training data. However, annotating a large number of images is labor-intensive. In order to reduce the dependency on expensive annotations, we propose a novel end-to-end feature reconstruction and metric based network for few-shot object detection (FM-FSOD). FM-FSOD integrates metric learning and meta-learning to tackle the few-shot object detection task. FM-FSOD is a class-agnostic model that can accurately recognize novel categories without fine-tuning on novel categories. Specifically, to quickly learn...

10.1016/j.cviu.2022.103600 article EN cc-by-nc-nd Computer Vision and Image Understanding 2022-11-24