NFDI4DS | UHH-SEMS - Publication Details

Yongjian Wu

ORCID: 0000-0003-2007-6929

About

Contact & Profiles

OpenAlex author ID: A5100716576

Research Areas

Advanced Neural Network Applications
Advanced Image and Video Retrieval Techniques
Domain Adaptation and Few-Shot Learning
Multimodal Machine Learning Applications
Video Surveillance and Tracking Methods
Adversarial Robustness in Machine Learning
Anomaly Detection Techniques and Applications
Image Retrieval and Classification Techniques
Human Pose and Action Recognition
Generative Adversarial Networks and Image Synthesis
Video Analysis and Summarization
Face recognition and analysis
Medical Image Segmentation Techniques
Natural Language Processing Techniques
Advanced Image Processing Techniques
Gait Recognition and Analysis
AI in cancer detection
Machine Learning and ELM
Topic Modeling
Advanced Data Compression Techniques
Image Processing Techniques and Applications
Radiomics and Machine Learning in Medical Imaging
Machine Learning and Data Classification
Image Enhancement Techniques
Explainable Artificial Intelligence (XAI)

Tencent (China)
2015-2024

Shandong Academy of Sciences
2023

Qilu University of Technology
2023

Beihang University
2023

State Key Laboratory of Software Development Environment
2023

Tsinghua University
2023

City University of Hong Kong
2023

Shandong University
2019-2022

Xiamen University
2018-2021

Artificial Intelligence in Medicine (Canada)
2021

Dual-level Collaborative Transformer for Image Captioning

OPENALEX - Publications

Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, and 3 more

Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning. However, they are still criticized for the lack of contextual information and fine-grained details, which in contrast are the merits of traditional grid features. In this paper, we introduce a novel Dual-Level Collaborative Transformer (DLCT) network to realize the complementary advantages of the two features. Concretely, in DLCT, these two features are first processed by a Dual-way Self Attention (DWSA) to mine their...

10.1609/aaai.v35i3.16328 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Cross-Modality Binary Code Learning via Fusion Similarity Hashing

Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang, Baochang Zhang

Binary code learning has been an emerging topic in large-scale cross-modality retrieval recently. It aims to map features from multiple modalities into a common Hamming space, where the cross-modality similarity can be approximated efficiently via Hamming distance. To this end, most existing works learn binary codes directly from data instances in multiple modalities, which preserve both intra- and inter-modal similarities respectively. Few methods consider the fusion similarity among multi-modal instances instead, which can explicitly capture their heterogeneous...

10.1109/cvpr.2017.672 article EN 2017-07-01

Accelerating Convolutional Networks via Global & Dynamic Filter Pruning

Shaohui Lin, Rongrong Ji, Yuchao Li, Yongjian Wu, Feiyue Huang, and 1 more

Accelerating convolutional neural networks has recently received ever-increasing research focus. Among various approaches proposed in the literature, filter pruning has been regarded as a promising solution, which is due to its advantage in significant speedup and memory reduction of both the network model and the intermediate feature maps. To this end, most approaches tend to prune filters in a layer-wise fixed manner, which is incapable of dynamically recovering previously removed filters, as well as jointly optimizing the pruned network across layers. In this paper,...

10.24963/ijcai.2018/336 article EN 2018-07-01

Channel Pruning via Automatic Structure Search

Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, and 1 more

Channel pruning is among the predominant approaches to compress deep neural networks. To this end, most existing pruning methods focus on selecting channels (filters) by importance/optimization or regularization based on rule-of-thumb designs, which results in sub-optimal pruning. In this paper, we propose a new channel pruning method based on the artificial bee colony algorithm (ABC), dubbed ABCPruner, which aims to efficiently find the optimal pruned structure, i.e., the channel number in each layer, rather than selecting "important" channels as previous works did. To solve...

10.24963/ijcai.2020/94 article EN 2020-07-01

RSTNet: Captioning with Adaptive Attention on Visual and Non-Visual Words

Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, and 3 more

Recent progress on visual question answering has explored the merits of grid features for vision language tasks. Meanwhile, transformer-based models have shown remarkable performance in various sequence prediction problems. However, the spatial information loss of grid features caused by the flattening operation, as well as the defect of the transformer model in distinguishing visual words and non-visual words, are still left unexplored. In this paper, we first propose a Grid-Augmented (GA) module, in which relative geometry features between grids are incorporated...

10.1109/cvpr46437.2021.01521 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Discover Cross-Modality Nuances for Visible-Infrared Person Re-Identification

Qiong Wu, Pingyang Dai, Jie Chen, Chia‐Wen Lin, Yongjian Wu, and 3 more

Visible-infrared person re-identification (Re-ID) aims to match the pedestrian images of the same identity from different modalities. Existing works mainly focus on alleviating the modality discrepancy by aligning the distributions of features from different modalities. However, nuanced but discriminative information, such as glasses, shoes, and the length of clothes, has not been fully explored, especially in the infrared modality. Without discovering nuances, it is challenging to match pedestrians across modalities using modality alignment solely, which...

10.1109/cvpr46437.2021.00431 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Improving Image Captioning by Leveraging Intra- and Inter-layer Global Representation in Transformer Network

Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, and 3 more

Transformer-based architectures have shown great success in image captioning, where object regions are encoded and then attended into the vectorial representations to guide the caption decoding. However, such vectorial representations only contain region-level information without considering the global information reflecting the entire image, which fails to expand the capability of complex multi-modal reasoning in image captioning. In this paper, we introduce a Global Enhanced Transformer (termed GET) to enable the extraction of a more comprehensive global representation,...

10.1609/aaai.v35i2.16258 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2021-05-18

Image-to-image Translation via Hierarchical Style Disentanglement

Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, and 4 more

Recently, image-to-image translation has made significant progress in achieving both multi-label (i.e., translation conditioned on different labels) and multi-style (i.e., generation with diverse styles) tasks. However, due to the unexplored independence and exclusiveness in the labels, existing endeavors are defeated by involving uncontrolled manipulations to the translation results. In this paper, we propose Hierarchical Style Disentanglement (HiSD) to address this issue. Specifically, we organize the labels into a hierarchical tree structure, in which...

10.1109/cvpr46437.2021.00853 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Exploiting Kernel Sparsity and Entropy for Interpretable CNN Compression

Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, and 3 more

Compressing convolutional neural networks (CNNs) has received ever-increasing research focus. However, most existing CNN compression methods do not interpret their inherent structures to distinguish the implicit redundancy. In this paper, we investigate the problem of CNN compression from a novel interpretable perspective. The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance...

10.1109/cvpr.2019.00291 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation

Yunhang Shen, Rongrong Ji, Yan Wang, Yongjian Wu, Liujuan Cao

Weakly supervised learning has attracted growing research attention due to the significant saving in annotation cost for tasks that require intra-image annotations, such as object detection and semantic segmentation. To this end, existing weakly supervised object detection and semantic segmentation approaches follow an iterative label mining and model training pipeline. However, such a self-enforcement pipeline makes both tasks easy to be trapped in local minimums. In this paper, we join weakly supervised object detection and segmentation tasks with a multi-task learning scheme for the first time, which uses their respective failure...

10.1109/cvpr.2019.00079 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Fast and Accurate Neural Word Segmentation for Chinese

Deng Cai, Hai Zhao, Zhisong Zhang, Xin Yuan, Yongjian Wu, and 1 more

Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation. However, both training and working procedures of the current neural models are computationally inefficient. In this paper, we propose a greedy neural word segmenter with balanced word and character embedding inputs to alleviate the existing drawbacks. Our segmenter is truly end-to-end, capable of performing segmentation much faster and even more accurately than state-of-the-art neural models on Chinese benchmark datasets.

10.18653/v1/p17-2096 article EN cc-by 2017-01-01

Universal Adversarial Perturbation via Prior Driven Uncertainty Approximation

Hong Liu, Rongrong Ji, Jie Li, Baochang Zhang, Yue Gao, and 2 more

Deep learning models have shown their vulnerabilities to universal adversarial perturbations (UAP), which are quasi-imperceptible. Compared to the conventional supervised UAPs that suffer from the knowledge of training data, the data-independent unsupervised UAPs are more applicable. Existing unsupervised methods fail to take advantage of the model uncertainty to produce robust perturbations. In this paper, we propose a new unsupervised universal adversarial perturbation method, termed Prior Driven Uncertainty Approximation (PD-UA), to generate a robust UAP by fully exploiting the model uncertainty at...

10.1109/iccv.2019.00303 article EN 2019 IEEE/CVF International Conference on Computer Vision (ICCV) 2019-10-01

HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping

Yuhan Wang, Xu Chen, Jun‐Wei Zhu, Wenqing Chu, Ying Tai, and 5 more

In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results. Unlike other existing face swapping works that only use a face recognition model to keep the identity similarity, we propose 3D shape-aware identity to control the face shape with the geometric supervision from 3DMM and a 3D face reconstruction method. Meanwhile, we introduce the Semantic Facial Fusion module to optimize the combination of encoder and decoder features and make adaptive blending, which makes the results more photo-realistic....

10.24963/ijcai.2021/157 article EN 2021-08-01

Carrying Out CNN Channel Pruning in a White Box

Yuxin Zhang, Mingbao Lin, Chia‐Wen Lin, Jie Chen, Yongjian Wu, and 2 more

Channel pruning has been long studied to compress convolutional neural networks (CNNs), which significantly reduces the overall computation. Prior works implement channel pruning in an unexplainable manner, which tends to reduce the final classification errors while failing to consider the internal influence of each channel. In this article, we conduct channel pruning in a white box. Through deep visualization of feature maps activated by different channels, we observe that different channels have a varying contribution to different categories in image classification....

10.1109/tnnls.2022.3147269 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-02-14

Dynamic Prototype Mask for Occluded Person Re-Identification

Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu

Although person re-identification has achieved an impressive improvement in recent years, the common occlusion case caused by different obstacles is still an unsettled issue in real application scenarios. Existing methods mainly address this issue by employing body clues provided by an extra network to distinguish the visible part. Nevertheless, the inevitable domain gap between the assistant model and the ReID datasets has highly increased the difficulty to obtain an effective and efficient model. To escape from the extra pre-trained networks and achieve...

10.1145/3503161.3547764 article EN Proceedings of the 30th ACM International Conference on Multimedia 2022-10-10

Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks

Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yan Wang, Liujuan Cao, and 3 more

Despite the exciting performance, Transformer is criticized for its excessive parameters and computation cost. However, compressing Transformer remains an open problem due to the internal complexity of its layer designs, i.e., Multi-Head Attention (MHA) and Feed-Forward Network (FFN). To address this issue, we introduce Group-wise Transformation towards a universal yet lightweight Transformer for vision-and-language tasks, termed LW-Transformer. LW-Transformer applies Group-wise Transformation to reduce both the parameters and computations of Transformer, while also preserving...

10.1109/tip.2021.3139234 article EN IEEE Transactions on Image Processing 2022-01-01

Occluded Person Re-identification via Saliency-Guided Patch Transfer

Lei Tan, Jiaer Xia, Wenfeng Liu, Pingyang Dai, Yongjian Wu, and 1 more

While generic person re-identification has made remarkable improvement in recent years, these methods are designed under the assumption that the entire body of the person is available. This brings about a significant performance degradation when suffering from occlusion caused by various obstacles in real-world applications. To address this issue, data-driven strategies have emerged to enhance the model's robustness to occlusion. Following the random erasing paradigm, these strategies typically employ randomly generated noise to supersede...

10.1609/aaai.v38i5.28312 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2024-03-24

GroupCap: Group-Based Image Captioning with Structured Relevance and Diversity Constraints

Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su

Most image captioning models focus on one-line (single image) captioning, where the correlations like relevance and diversity among group images (e.g., within the same album or event) are simply neglected, resulting in less accurate and diverse captions. Recent works mainly consider imposing the diversity during the online inference only, which neglects the correlation among visual structures in offline training. In this paper, we propose a novel group-based image captioning scheme (termed GroupCap), which jointly models the structured relevance and diversity among group images towards an optimal...

10.1109/cvpr.2018.00146 article EN 2018-06-01

Towards Optimal Fine Grained Retrieval via Decorrelated Centralized Loss with Normalize-Scale Layer

Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Baochang Zhang, Yongjian Wu, and 1 more

Recent advances on fine-grained image retrieval prefer learning a convolutional neural network (CNN) with specific fully-connect layer designed loss function for discriminative feature representation. Essentially, such a loss should establish a robust metric to efficiently distinguish high-dimensional features within and outside fine-grained categories. To this end, the existing loss functions are defective in two aspects: (a) The feature relationship is encoded inside the training batch. Such a local scope leads to low accuracy. (b)...

10.1609/aaai.v33i01.33019291 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Supervised Matrix Factorization for Cross-Modality Hashing

Hong Liu, Rongrong Ji, Yongjian Wu, Gang Hua

Matrix factorization has been recently utilized for the task of multi-modal hashing for cross-modality visual search, where basis functions are learned to map data from different modalities to the same Hamming embedding. In this paper, we propose a novel cross-modality hashing algorithm termed Supervised Matrix Factorization Hashing (SMFH) which tackles the multi-modal hashing problem with a collective non-matrix factorization across the different modalities. In particular, SMFH employs a well-designed binary code learning algorithm to preserve the similarities among multi-modal original features through a graph...

10.48550/arxiv.1603.05572 preprint EN cc-by arXiv (Cornell University) 2016-01-01

Supervised Online Hashing via Hadamard Codebook Learning

Mingbao Lin, Rongrong Ji, Hong Liu, Yongjian Wu

In recent years, binary code learning, a.k.a. hashing, has received extensive attention in large-scale multimedia retrieval. It aims to encode high-dimensional data points into binary codes, hence the original high-dimensional metric space can be efficiently approximated via Hamming space. However, most existing hashing methods adopt offline batch learning, which is not suitable for handling incremental datasets with streaming data or new instances. In contrast, the robustness of the existing online hashing remains an open problem, while the embedding...

10.1145/3240508.3240519 article EN Proceedings of the 26th ACM International Conference on Multimedia 2018-10-15

Rotated Binary Neural Network

Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, and 3 more

Binary Neural Network (BNN) shows its predominance in reducing the complexity of deep neural networks. However, it suffers severe performance degradation. One of the major impediments is the large quantization error between the full-precision weight vector and its binary vector. Previous works focus on compensating for the norm gap while leaving the angular bias hardly touched. In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version. At...

10.48550/arxiv.2009.13055 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval

Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Feiyue Huang, and 1 more

Fine-grained object retrieval has attracted extensive research focus recently. Its state-of-the-art schemes are typically based upon convolutional neural network (CNN) features. Despite the extensive progress, two issues remain open. On one hand, the deep features are coarsely extracted at image level rather than precisely at object level, which are interrupted by background clutters. On the other hand, training CNN features with a standard triplet loss is time consuming and incapable of learning discriminative features. In this paper, we present a novel...

10.24963/ijcai.2018/171 article EN 2018-07-01

Network Pruning Using Adaptive Exemplar Filters

Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, and 2 more

Popular network pruning algorithms reduce redundant information by optimizing hand-crafted models, and may cause suboptimal performance and long time in selecting filters. We innovatively introduce adaptive exemplar filters to simplify the algorithm design, resulting in an automatic and efficient pruning approach called EPruner. Inspired by the face recognition community, we use a message-passing algorithm, Affinity Propagation, on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters. EPruner breaks the dependence...

10.1109/tnnls.2021.3084856 article EN IEEE Transactions on Neural Networks and Learning Systems 2021-06-09

Distilling a Powerful Student Model via Online Knowledge Distillation

Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, and 2 more

Existing online knowledge distillation approaches either adopt the student with the best performance or construct an ensemble model for better holistic performance. However, the former strategy ignores other students' information, while the latter increases the computational complexity during deployment. In this article, we propose a novel method for online knowledge distillation, termed feature fusion and self-distillation (FFSD), which comprises two key components: feature fusion and self-distillation, toward solving the above problems in a unified framework....

10.1109/tnnls.2022.3152732 article EN IEEE Transactions on Neural Networks and Learning Systems 2022-03-07
