Yuanjie Shao

ORCID: 0000-0003-1141-0454
Research Areas
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Advanced Image Processing Techniques
  • Advanced Vision and Imaging
  • Advanced Neural Network Applications
  • Human Pose and Action Recognition
  • Anomaly Detection Techniques and Applications
  • Image Enhancement Techniques
  • Face and Expression Recognition
  • Video Surveillance and Tracking Methods
  • Remote-Sensing Image Classification
  • Advanced Image and Video Retrieval Techniques
  • Topic Modeling
  • Advanced Image Fusion Techniques
  • Face recognition and analysis
  • Infrared Target Detection Methodologies
  • Gait Recognition and Analysis
  • Image Retrieval and Classification Techniques
  • Machine Learning and ELM
  • Cancer-related molecular mechanisms research
  • Sparse and Compressive Sensing Techniques
  • Image Processing Techniques and Applications
  • Fire Detection and Safety Systems
  • Medical Image Segmentation Techniques
  • COVID-19 diagnosis using AI

Huazhong University of Science and Technology
2015-2024

Image dehazing using learning-based methods has achieved state-of-the-art performance in recent years. However, most existing methods train a dehazing model on synthetic hazy images, which are less able to generalize well to real hazy images due to domain shift. To address this issue, we propose a domain adaptation paradigm, which consists of an image translation module and two image dehazing modules. Specifically, we first apply a bidirectional translation network to bridge the gap between the synthetic and real domains by translating images from one domain to another. And then, we use images before and after translation to train the proposed...

10.1109/cvpr42600.2020.00288 article EN 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2020-06-01

Most recent approaches for online action detection tend to apply a Recurrent Neural Network (RNN) to capture long-range temporal structure. However, RNN suffers from non-parallelism and gradient vanishing, hence it is hard to optimize. In this paper, we propose a new encoder-decoder framework based on Transformers, named OadTR, to tackle these problems. The encoder, attached with a task token, aims to capture the relationships and global interactions between historical observations. The decoder extracts auxiliary...

10.1109/iccv48922.2021.00747 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Self-supervised learning presents a remarkable performance in utilizing unlabeled data for various video tasks. In this paper, we focus on applying the power of self-supervised methods to improve semi-supervised action proposal generation. Particularly, we design an effective Self-supervised Semi-supervised Temporal Action Proposal (SSTAP) framework. The SSTAP contains two crucial branches, i.e., a temporal-aware branch and a relation-aware branch. The temporal-aware branch improves the model by introducing temporal perturbations, i.e., temporal feature shift...

10.1109/cvpr46437.2021.00194 article EN 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2021-06-01

Recently, many approaches tackle the Unsupervised Domain Adaptive person re-identification (UDA re-ID) problem through pseudo-label-based contrastive learning. During training, a uni-centroid representation is obtained by simply averaging all the instance features from a cluster with the same pseudo label. However, a cluster may contain images with different identities (label noises) due to the imperfect clustering results, which makes the uni-centroid representation inappropriate. In this paper, we present a novel Multi-Centroid Memory (MCM) to adaptively...

10.1609/aaai.v36i3.20178 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

We propose a Generative Transfer Network (GTNet) for zero-shot object detection (ZSD). GTNet consists of an Object Detection Module and a Knowledge Transfer Module. The Object Detection Module can learn large-scale seen domain knowledge. The Knowledge Transfer Module leverages a feature synthesizer to generate unseen class features, which are applied to train a new classification layer for the Object Detection Module. In order to synthesize features for each unseen class with both the intra-class variance and the IoU variance, we design an IoU-Aware Generative Adversarial Network (IoUGAN) as the feature synthesizer, which can be easily integrated into GTNet....

10.1609/aaai.v34i07.6996 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Few-shot classification (FSC), which aims to identify novel classes in the presence of a few labeled samples, has drawn vast attention in recent years. One of the representative few-shot classification methods is model-agnostic meta-learning (MAML), which focuses on learning an initialization that can quickly adapt to novel categories with a few annotated samples. However, due to insufficient samples, MAML can easily fall into the dilemma of overfitting. Most existing MAML-based methods either improve the inner-loop update rule to achieve better generalization or constrain...

10.1109/tcsvt.2022.3232717 article EN IEEE Transactions on Circuits and Systems for Video Technology 2022-12-27

In web data, advertising images are crucial for capturing user attention and improving advertising effectiveness. Most existing methods that generate backgrounds for products focus primarily on aesthetic quality, which may fail to achieve satisfactory online performance. To address this limitation, we explore the use of Multimodal Large Language Models (MLLMs) for generating advertising images by optimizing for Click-Through Rate (CTR) as the primary objective. Firstly, we build targeted pre-training tasks, and leverage a large-scale e-commerce...

10.48550/arxiv.2502.06823 preprint EN arXiv (Cornell University) 2025-02-05

Generative methods have been successfully applied in zero-shot learning (ZSL) by learning an implicit mapping to alleviate the visual-semantic domain gaps and synthesizing unseen samples to handle the data imbalance between seen and unseen classes. However, existing generative methods simply use visual features extracted by a pre-trained CNN backbone. These visual features lack attribute-level semantic information. Consequently, classes are indistinguishable, and the knowledge transfer from seen to unseen classes is limited. To tackle this issue, we propose a novel Semantic...

10.24963/ijcai.2022/134 article EN Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence 2022-07-01

The conventional text-based person re-identification methods heavily rely on identity annotations. However, this labeling process is costly and time-consuming. In this paper, we consider a more practical setting called weakly supervised text-based person re-identification, where only the text-image pairs are available without the requirement of annotating identities during the training phase. To this end, we propose a Cross-Modal Mutual Training (CMMT) framework. Specifically, to alleviate the intra-class variations, a clustering method...

10.1109/iccv48922.2021.01120 article EN 2021 IEEE/CVF International Conference on Computer Vision (ICCV) 2021-10-01

Deep learning-based methods for low-light image enhancement typically require enormous paired training data, which are impractical to capture in real-world scenarios. Recently, unsupervised approaches have been explored to eliminate the reliance on paired training data. However, they perform erratically in diverse real-world scenarios due to the absence of priors. To address this issue, we propose an unsupervised low-light image enhancement method based on an effective prior termed histogram equalization prior (HEP). Our work is inspired by the interesting observation that the feature maps...

10.48550/arxiv.2112.01766 preprint EN other-oa arXiv (Cornell University) 2021-01-01

10.1109/cvpr52733.2024.02684 article EN 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024-06-16

Deep learning-based methods have improved visual tracking precision significantly. However, background distraction and highly precise localization remain challenging problems. Although some methods fuse deep and shallow layer features to solve these problems, the existing fusion methods, like simply concatenating or adding the features from different layers, cannot take full advantage of both. In this paper, we propose a new adaptive feature fusion method, called instance-based feature pyramid (IBFP), to obtain discriminative...

10.1109/tcsvt.2021.3113041 article EN IEEE Transactions on Circuits and Systems for Video Technology 2021-09-15

Object detection, a subfield of computer vision that aims to accurately identify and locate specific objects in images or videos, has achieved remarkable progress. Such methods rely on large-scale labeled training samples for each object category to ensure accurate detection, but obtaining extensive annotated data is a labor-intensive and expensive process in many real-world scenarios. To tackle this challenge, researchers have explored few-shot object detection (FSOD), which combines few-shot learning and object detection techniques to rapidly adapt...

10.2139/ssrn.4611614 preprint EN 2023-01-01

Image dehazing using learning-based methods has achieved state-of-the-art performance in recent years. However, most existing methods train a dehazing model on synthetic hazy images, which are less able to generalize well to real hazy images due to domain shift. To address this issue, we propose a domain adaptation paradigm, which consists of an image translation module and two image dehazing modules. Specifically, we first apply a bidirectional translation network to bridge the gap between the synthetic and real domains by translating images from one domain to another. And then, we use images before and after translation to train the proposed...

10.48550/arxiv.2005.04668 preprint EN other-oa arXiv (Cornell University) 2020-01-01

Image matching is widely used in visual-based navigation systems, most of which simply assume ideal inputs without considering real-world degradation, such as image blur. In the presence of such degradation, traditional matching methods first resort to image restoration and then perform image matching with the restored image. However, by treating restoration and matching separately, the accuracy of matching will be reduced by the defective output of image restoration. In this paper, we propose a joint image restoration and matching method based on distance-weighted sparse representation (JRM-DSR), which utilizes a prior to exploit...

10.1109/icpr.2018.8545413 article EN 2018 24th International Conference on Pattern Recognition (ICPR) 2018-08-01

Cross-domain few-shot classification (CD-FSC) aims to identify novel target classes with a few samples, assuming that there exists a domain shift between source and target domains. Existing state-of-the-art practices typically pre-train on the source domain and then finetune on the few-shot target data to yield task-adaptive representations. Despite promising progress, these methods are prone to overfitting the limited target distribution because of data scarcity, and they ignore the transferable knowledge learned in the source domain. To alleviate this problem, we propose a simple...

10.48550/arxiv.2308.00727 preprint EN other-oa arXiv (Cornell University) 2023-01-01