NFDI4DS | UHH-SEMS - Publication Details

Jungong Han

ORCID: 0000-0003-4361-956X

About

Contact & Profiles

A5046605531

Research Areas

Advanced Image and Video Retrieval Techniques
Advanced Neural Network Applications
Domain Adaptation and Few-Shot Learning
Video Surveillance and Tracking Methods
Human Pose and Action Recognition
Multimodal Machine Learning Applications
Visual Attention and Saliency Detection
Advanced Vision and Imaging
Remote-Sensing Image Classification
Video Analysis and Summarization
Anomaly Detection Techniques and Applications
Image Retrieval and Classification Techniques
COVID-19 diagnosis using AI
Image Enhancement Techniques
Advanced Image Processing Techniques
Image Processing Techniques and Applications
Image and Signal Denoising Methods
Advanced Image Fusion Techniques
Gait Recognition and Analysis
Machine Learning and ELM
Face recognition and analysis
Face and Expression Recognition
Industrial Vision Systems and Defect Detection
Hand Gesture Recognition Systems
Adversarial Robustness in Machine Learning

University of Sheffield
2023-2025

Tsinghua University
2021-2025

University of Warwick
2019-2025

Aberystwyth University
2020-2024

Sichuan University
2024

Tencent (China)
2023

Anhui University of Technology
2022

Xidian University
2001-2021

Lancaster University
2017-2020

Beihang University
2018-2019

RepVGG: Making VGG-style ConvNets Great Again

OPENALEX - Publications

Xiaohan Ding Xiangyu Zhang Ningning Ma Jungong Han Guiguang Ding and 1 more

We present a simple but powerful architecture of convolutional neural network, which has a VGG-like inference-time body composed of nothing but a stack of 3 × 3 convolution and ReLU, while the training-time model has a multi-branch topology. Such decoupling of the training-time and inference-time architecture is realized by a structural re-parameterization technique, so that the model is named RepVGG. On ImageNet, RepVGG reaches over 80% top-1 accuracy, which is the first time for a plain model, to the best of our knowledge. On an NVIDIA 1080Ti GPU, RepVGG models run 83% faster than ResNet-50 or 101% faster than ResNet-101...

10.1109/cvpr46437.2021.01352 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01
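
The structural re-parameterization mentioned in the RepVGG abstract above can be illustrated with a minimal sketch. It assumes a single channel and omits batch normalization and ReLU, which is a simplification for illustration only; the function names (`conv2d_same`, `merge_branches`, `multi_branch_forward`) are hypothetical and not taken from the paper's code. The sketch relies only on the linearity of convolution: a 3 × 3 branch, a 1 × 1 branch, and an identity branch sum to the same output as one 3 × 3 kernel whose center weight absorbs the other two branches.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def multi_branch_forward(image, k3, k1_scalar):
    """Training-time view: 3x3 branch + 1x1 branch + identity branch, summed."""
    y3 = conv2d_same(image, k3)
    y1 = conv2d_same(image, [[k1_scalar]])
    h, w = len(image), len(image[0])
    return [[y3[i][j] + y1[i][j] + image[i][j] for j in range(w)] for i in range(h)]


def merge_branches(k3, k1_scalar):
    """Inference-time view: fold the 1x1 and identity branches into the 3x3 kernel.

    A 1x1 kernel equals a 3x3 kernel that is zero everywhere except the center,
    and the identity map equals a 3x3 kernel with 1.0 at the center.
    """
    merged = [row[:] for row in k3]
    merged[1][1] += k1_scalar + 1.0
    return merged


if __name__ == "__main__":
    image = [[1.0, 2.0, 0.0], [0.5, -1.0, 3.0], [2.0, 1.0, -0.5]]
    k3 = [[0.5, -1.0, 0.25], [1.0, 2.0, -0.5], [0.0, 0.75, 1.5]]
    k1 = -0.25
    a = multi_branch_forward(image, k3, k1)
    b = conv2d_same(image, merge_branches(k3, k1))
    # The two views should agree up to floating-point rounding.
    print(max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3)))
```

In a multi-channel network the same idea would apply per output/input channel pair, with each branch's batch normalization folded into its kernel and bias before merging; that step is left out here.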

Scaling Up Your Kernels to 31×31: Revisiting Large Kernel Design in CNNs

Xiaohan Ding Xiangyu Zhang Jungong Han Guiguang Ding

We revisit large kernel design in modern convolutional neural networks (CNNs). Inspired by recent advances in vision transformers (ViTs), in this paper, we demonstrate that using a few large convolutional kernels instead of a stack of small kernels could be a more powerful paradigm. We suggested five guidelines, e.g., applying re-parameterized large depthwise convolutions, to design efficient high-performance large-kernel CNNs. Following the guidelines, we propose RepLKNet, a pure CNN architecture whose kernel size is as large as 31×31, in contrast to commonly used 3×3. RepLKNet...

10.1109/cvpr52688.2022.01166 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

ACNet: Strengthening the Kernel Skeletons for Powerful CNN via Asymmetric Convolution Blocks

Xiaohan Ding Yuchen Guo Guiguang Ding Jungong Han

As designing an appropriate Convolutional Neural Network (CNN) architecture in the context of a given application usually involves heavy human works or numerous GPU hours, the research community is soliciting architecture-neutral CNN structures, which can be easily plugged into multiple mature architectures to improve the performance on our real-world applications. We propose Asymmetric Convolution Block (ACB), an architecture-neutral structure as a CNN building block, which uses 1D asymmetric convolutions to strengthen the square...

10.1109/iccv.2019.00200 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

YOLOv10: Real-Time End-to-End Object Detection

Ao Wang Hui Chen Lihao Liu Kai Chen Zijia Lin and 2 more

Over the past years, YOLOs have emerged as the predominant paradigm in the field of real-time object detection owing to their effective balance between computational cost and detection performance. Researchers have explored the architectural designs, optimization objectives, data augmentation strategies, and others for YOLOs, achieving notable progress. However, the reliance on non-maximum suppression (NMS) for post-processing hampers the end-to-end deployment of YOLOs and adversely impacts the inference latency. Besides, the design of various components...

10.48550/arxiv.2405.14458 preprint EN arXiv (Cornell University) 2024-05-23

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval

Hui Chen Guiguang Ding Xudong Liu Zijia Lin Ji Liu and 1 more

Enabling bi-directional retrieval of images and texts is important for understanding the correspondence between vision and language. Existing methods leverage the attention mechanism to explore such correspondence in a fine-grained manner. However, most of them consider all semantics equally and thus align them uniformly, regardless of their diverse complexities. In fact, semantics are diverse (i.e. involving different kinds of semantic concepts), and humans usually follow a latent structure to combine them into understandable languages. It may be difficult...

10.1109/cvpr42600.2020.01267 preprint EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Diverse Branch Block: Building a Convolution as an Inception-like Unit

Xiaohan Ding Xiangyu Zhang Jungong Han Guiguang Ding

We propose a universal building block of Convolutional Neural Network (ConvNet) to improve the performance without any inference-time costs. The block is named Diverse Branch Block (DBB), which enhances the representational capacity of a single convolution by combining diverse branches of different scales and complexities to enrich the feature space, including sequences of convolutions, multiscale convolutions, and average pooling. After training, a DBB can be equivalently converted into a single conv layer for deployment. Unlike the advancements...

10.1109/cvpr46437.2021.01074 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Cross-modality deep feature learning for brain tumor segmentation

Dingwen Zhang Guohai Huang Qiang Zhang Jungong Han Junwei Han and 1 more

10.1016/j.patcog.2020.107562 article EN Pattern Recognition 2020-07-27

Gabor Convolutional Networks

Shangzhen Luan Chen Chen Baochang Zhang Jungong Han Jianzhuang Liu

Steerable properties dominate the design of traditional filters, e.g., Gabor filters, and endow features with the capability of dealing with spatial transformations. However, such excellent properties have not been well explored in the popular deep convolutional neural networks (DCNNs). In this paper, we propose a new deep model, termed Gabor Convolutional Networks (GCNs or Gabor CNNs), which incorporates Gabor filters into DCNNs to enhance the resistance of deep learned features to orientation and scale changes. By only manipulating the basic element of DCNNs based on Gabor filters, i.e., the convolution...

10.1109/tip.2018.2835143 article EN IEEE Transactions on Image Processing 2018-05-10

RGB-T Salient Object Detection via Fusing Multi-Level CNN Features

Qiang Zhang Nianchang Huang Lin Yao Dingwen Zhang Caifeng Shan and 1 more

RGB-induced salient object detection has recently witnessed substantial progress, which is attributed to the superior feature learning capability of deep convolutional neural networks (CNNs). However, such detections suffer from challenging scenarios characterized by cluttered backgrounds, low-light conditions and variations in illumination. Instead of improving RGB based saliency detection, this paper takes advantage of the complementary benefits of RGB and thermal infrared images. Specifically, we propose a...

10.1109/tip.2019.2959253 article EN IEEE Transactions on Image Processing 2019-12-17

Centripetal SGD for Pruning Very Deep Convolutional Networks With Complicated Structure

Xiaohan Ding Guiguang Ding Yuchen Guo Jungong Han

Redundancy is widely recognized in Convolutional Neural Networks (CNNs), which makes it possible to remove some unimportant filters from convolutional layers so as to slim the network with an acceptable performance drop. Inspired by the linearity of convolution, we seek to make some filters increasingly close and eventually identical for network slimming. To this end, we propose Centripetal SGD (C-SGD), a novel optimization method, which can train several filters to collapse into a single point in the parameter hyperspace. When the training is completed, the removal...

10.1109/cvpr.2019.00508 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Cross-View Retrieval via Probability-Based Semantics-Preserving Hashing

Zijia Lin Guiguang Ding Jungong Han Jianmin Wang

For efficiently retrieving nearest neighbors from large-scale multiview data, hashing methods have recently been widely investigated, which can substantially improve query speeds. In this paper, we propose an effective probability-based semantics-preserving hashing (SePH) method to tackle the problem of cross-view retrieval. Considering the semantic consistency between views, SePH generates one unified hash code for all observed views of any instance. In training, SePH first transforms the given semantic affinities of training data...

10.1109/tcyb.2016.2608906 article EN IEEE Transactions on Cybernetics 2016-09-29

Episode-Based Prototype Generating Network for Zero-Shot Learning

Yunlong Yu Zhong Ji Jungong Han Zhongfei Zhang

We introduce a simple yet effective episode-based training framework for zero-shot learning (ZSL), where the learning system is required to recognize unseen classes given only the corresponding class semantics. During training, the model is trained within a collection of episodes, each of which is designed to simulate a zero-shot classification task. Through training on multiple episodes, the model progressively accumulates ensemble experiences on predicting the mimetic unseen classes, which will generalize well on the real unseen classes. Based on this training framework, we propose a novel generative model that...

10.1109/cvpr42600.2020.01405 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

FMCNet: Feature-Level Modality Compensation for Visible-Infrared Person Re-Identification

Qiang Zhang Changzhou Lai Jianan Liu Nianchang Huang Jungong Han

For Visible-Infrared person ReIDentification (VI-ReID), existing modality-specific information compensation based models try to generate the images of the missing modality from existing ones for reducing the cross-modality discrepancy. However, because of the large modality discrepancy between visible and infrared images, the generated images usually have low qualities and introduce much more interfering information (e.g., color inconsistency). This greatly degrades the subsequent VI-ReID performance. Alternatively, we present a novel Feature-level...

10.1109/cvpr52688.2022.00720 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting

Xiaohan Ding Tianxiang Hao Jianchao Tan Ji Liu Jungong Han and 2 more

We propose ResRep, a novel method for lossless channel pruning (a.k.a. filter pruning), which slims down a CNN by reducing the width (number of output channels) of convolutional layers. Inspired by the neurobiology research about the independence of remembering and forgetting, we propose to re-parameterize a CNN into the remembering parts and forgetting parts, where the former learn to maintain the performance and the latter learn to prune. Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity. Then we equivalently merge...

10.1109/iccv48922.2021.00447 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Part-Object Relational Visual Saliency

Yi Liu Dingwen Zhang Qiang Zhang Jungong Han

Recent years have witnessed a big leap in automatic visual saliency detection attributed to advances in deep learning, especially Convolutional Neural Networks (CNNs). However, inferring the saliency of each image part separately, as was adopted by most CNN methods, inevitably leads to an incomplete segmentation of the salient object. In this paper, we describe how to use the property of part-object relations endowed by the Capsule Network (CapsNet) to solve the problems that fundamentally hinge on relational inference for visual saliency detection...

10.1109/tpami.2021.3053577 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-01

RepViT: Revisiting Mobile CNN From ViT Perspective

Ao Wang Hui Chen Zijia Lin Jungong Han Guiguang Ding

10.1109/cvpr52733.2024.01506 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Memory Attention Networks for Skeleton-Based Action Recognition

Ce Li Chunyu Xie Baochang Zhang Jungong Han Xiantong Zhen and 1 more

Skeleton-based action recognition has been extensively studied, but it remains an unsolved problem because of the complex variations of skeleton joints in 3-D spatiotemporal space. To handle this issue, we propose a new temporal-then-spatial recalibration method named memory attention networks (MANs) and deploy MANs using the temporal attention recalibration module (TARM) and spatiotemporal convolution module (STCM). In the TARM, a novel temporal attention mechanism is built based on residual learning to recalibrate frames of skeleton data temporally. In the STCM, the recalibrated sequence...

10.1109/tnnls.2021.3061115 article EN IEEE Transactions on Neural Networks and Learning Systems 2021-03-15

ABMDRNet: Adaptive-weighted Bi-directional Modality Difference Reduction Network for RGB-T Semantic Segmentation

Qiang Zhang Shenlu Zhao Yongjiang Luo Dingwen Zhang Nianchang Huang and 1 more

Semantic segmentation models gain robustness against poor lighting conditions by virtue of complementary information from visible (RGB) and thermal images. Despite its importance, most existing RGB-T semantic segmentation models perform primitive fusion strategies, such as concatenation, element-wise summation and weighted summation, to fuse features from different modalities. These strategies, unfortunately, overlook the modality differences due to different imaging mechanisms, so that they suffer from the reduced discriminability of the fused features. To...

10.1109/cvpr46437.2021.00266 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Region-Object Relation-Aware Dense Captioning via Transformer

Zhuang Shao Jungong Han Demetris Marnerides Kurt Debattista

Dense captioning provides detailed captions of complex visual scenes. While a number of successes have been achieved in recent years, there are still two broad limitations: 1) most existing methods adopt an encoder-decoder framework, where the contextual information is sequentially encoded using long short-term memory (LSTM). However, the forget gate mechanism of LSTM makes it vulnerable when dealing with a long sequence and 2) the vast majority of prior arts consider regions of interests (RoIs) equally important,...

10.1109/tnnls.2022.3152990 article EN publisher-specific-oa IEEE Transactions on Neural Networks and Learning Systems 2022-03-11

Textual Context-Aware Dense Captioning With Diverse Words

Zhuang Shao Jungong Han Kurt Debattista Yanwei Pang

Dense captioning generates more detailed spoken descriptions for complex visual scenes. Despite several promising leads, existing methods still have two broad limitations: 1) The vast majority of prior arts only consider visual contextual clues during captioning but ignore potentially important textual context; 2) current imbalanced learning mechanisms limit the diversity of vocabulary learned from the dictionary, thus giving rise to low language-learning efficiency. To alleviate these gaps, in this paper, we...

10.1109/tmm.2023.3241517 article EN IEEE Transactions on Multimedia 2023-01-01

Capsule Networks With Residual Pose Routing

Yi Liu De Cheng Dingwen Zhang Shoukun Xu Jungong Han

Capsule networks (CapsNets) have been known to be difficult to develop into a deeper architecture, which is desirable for high performance in the deep learning era, due to the complex capsule routing algorithms. In this article, we present a simple yet effective capsule routing algorithm, which is presented by a residual pose routing. Specifically, the higher-layer capsule pose is achieved by an identity mapping on the adjacently lower-layer capsule pose. Such simple residual pose routing has two advantages: 1) reducing the routing computation complexity and 2) avoiding gradient vanishing due to its residual learning framework. On top...

10.1109/tnnls.2023.3347722 article EN IEEE Transactions on Neural Networks and Learning Systems 2024-01-09

Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction With Extremely Limited Labels

Changrui Chen Jungong Han Kurt Debattista

This is a repository copy of "Virtual category learning: a semi-supervised learning method for dense prediction with extremely limited labels".

10.1109/tpami.2024.3367416 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-02-20