- Domain Adaptation and Few-Shot Learning
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Multimodal Machine Learning Applications
- Anomaly Detection Techniques and Applications
- Visual Attention and Saliency Detection
- Photonic and Optical Devices
- Sparse and Compressive Sensing Techniques
- Face and Expression Recognition
- Semiconductor Quantum Structures and Devices
- Image Enhancement Techniques
- COVID-19 diagnosis using AI
- Advanced Steganography and Watermarking Techniques
- Gait Recognition and Analysis
- Internet Traffic Analysis and Secure E-voting
- Industrial Vision Systems and Defect Detection
- Remote-Sensing Image Classification
- Machine Learning and ELM
- Network Security and Intrusion Detection
- Image Processing Techniques and Applications
- Robotics and Sensor-Based Localization
- Advanced Fiber Laser Technologies
- Image Retrieval and Classification Techniques
Nanjing University of Aeronautics and Astronautics
2009-2025
Fudan University
1998-2025
Huashan Hospital
2025
Fujian Institute of Research on the Structure of Matter
2025
Chinese Academy of Sciences
2021-2025
Xidian University
2012-2024
Xi'an Jiaotong University
2024
University of Chinese Academy of Sciences
2022-2024
Guangxi University
1990-2024
Guangdong Polytechnic Normal University
2024
Zero-shot learning (ZSL) aims to classify images from unseen categories by merely utilizing seen class images as the training data. Existing works on ZSL mainly leverage the global features or learn the global regions, from which to construct the embeddings to the semantic space. However, few of them study the discrimination power implied in local image regions (parts), which, in some sense, correspond to semantic attributes, have stronger discrimination than attributes, and can thus assist the semantic transfer between seen/unseen classes. In this paper, to discover (semantic) regions, we propose...
Zero-shot learning (ZSL) is a challenging task due to the lack of unseen class data during training. Existing works attempt to establish a mapping between the visual and class spaces through a common intermediate semantic space. The main limitation of existing methods is the strong bias towards seen classes, known as the domain shift problem, which leads to unsatisfactory performance in both conventional and generalized ZSL tasks. To tackle this challenge, we propose to convert ZSL to a supervised learning task by generating features for the unseen classes. To this end,...
Vision transformers have recently shown strong global context modeling capabilities in camouflaged object detection. However, they suffer from two major limitations: less effective locality modeling and insufficient feature aggregation in decoders, which are not conducive to camouflaged object detection that explores subtle cues from indistinguishable backgrounds. To address these issues, in this paper, we propose a novel transformer-based Feature Shrinkage Pyramid Network (FSPNet), which aims to hierarchically decode...
Image-level weakly supervised semantic segmentation (WSSS) is a fundamental yet challenging computer vision task facilitating scene understanding and automatic driving. Most existing methods resort to classification-based Class Activation Maps (CAMs) to play as the initial pseudo labels, which tend to focus on the discriminative image regions and lack customized characteristics for the segmentation task. To alleviate this issue, we propose a novel activation modulation and recalibration (AMR) scheme, which leverages a spotlight branch...
Recently, zero-shot action recognition (ZSAR) has emerged with the explosive growth of action categories. In this paper, we explore ZSAR from a novel perspective by adopting the Error-Correcting Output Codes (dubbed ZSECOC). Our ZSECOC equips the conventional ECOC with the additional capability of ZSAR by addressing the domain shift problem. In particular, we learn discriminative ZSECOC for seen categories from both category-level semantics and intrinsic data structures. This procedure deals with domain shift implicitly by transferring the well-established...
In this paper, we propose an analysis mechanism-based structured analysis discriminative dictionary learning (ADDL) framework. The ADDL seamlessly integrates dictionary learning, representation, and classifier training into a unified model. The applied analysis mechanism can make sure that the learned dictionaries, representations, and linear classifiers over different classes are independent and discriminating as much as possible. The dictionary is obtained by minimizing a reconstruction error and an analytical incoherence-promoting term that encourages...
Both interclass variances and intraclass similarities are crucial for improving the classification performance of discriminative dictionary learning (DDL) algorithms. However, existing DDL methods often ignore the combination between the interclass and intraclass properties of dictionary atoms and coding coefficients. To address this problem, in this paper, we propose a discriminative Fisher embedding dictionary learning (DFEDL) algorithm that simultaneously establishes Fisher embedding models on learned atoms and coefficients. Specifically, we first construct a discriminative Fisher atom embedding model by exploring the Fisher criterion of the atoms, which encourages the atoms of the same...
Video-based person re-identification (re-ID) is an important research topic in computer vision. The key to tackling the challenging task is to exploit both spatial and temporal clues in video sequences. In this work, we propose a novel graph-based framework, namely Multi-Granular Hypergraph (MGH), to pursue better representational capabilities by modeling spatiotemporal dependencies in terms of multiple granularities. Specifically, hypergraphs with different granularities are constructed using various...
Conventional unsupervised hashing methods usually take advantage of similarity graphs, which are either pre-computed in the high-dimensional space or obtained from random anchor points. On the one hand, existing methods uncouple the procedures of hash function learning and graph construction. On the other hand, graphs empirically built upon original data could introduce biased prior knowledge of data relevance, leading to sub-optimal retrieval performance. In this paper, we tackle the above problems by proposing an efficient adaptive...
In real life, group activity recognition plays a significant and fundamental role in a variety of applications, e.g. sports video analysis, abnormal behavior detection, and intelligent surveillance. In a complex dynamic scene, a crucial yet challenging issue is how to better model the spatio-temporal contextual information and inter-person relationship. In this paper, we present a novel attentive semantic recurrent neural network (RNN), namely, stagNet, for understanding group activities and individual actions in videos, by...
The goal of few-shot learning is to learn a classifier that can recognize unseen classes from limited support data with labels. A common practice for this task is to train a model on the base set first and then transfer to the novel classes through fine-tuning or meta-learning. However, as the base classes have no overlap with the novel set, simply transferring the whole knowledge from the base data is not an optimal solution, since some knowledge in the base model may be biased or even harmful to the novel class. In this paper, we propose to transfer partial knowledge by freezing particular layer(s) in the base model. Specifically, layers will...
Person search aims to simultaneously localize and identify a query person from realistic, uncropped images, which can be regarded as the unified task of pedestrian detection and person re-identification (re-id). Most existing works employ two-stage detectors like Faster-RCNN, yielding encouraging accuracy but with high computational overhead. In this work, we present the Feature-Aligned Person Search Network (AlignPS), the first anchor-free framework to efficiently tackle this challenging task. AlignPS explicitly addresses...
Tactile sensors play a very important role for robot perception in the dynamic or unknown environment. However, tactile object recognition exhibits great challenges in practical scenarios. In this paper, we address this problem by developing an extreme kernel sparse learning methodology. This method combines the advantages of extreme learning machine and kernel sparse learning by simultaneously addressing the dictionary learning and the classifier design problems. Furthermore, to tackle the intrinsic difficulties which are introduced by the representer theorem, we develop a reduced...
Weakly-supervised temporal action localization (WTAL) in untrimmed videos has emerged as a practical but challenging task since only video-level labels are available. Existing approaches typically leverage off-the-shelf segment-level features, which suffer from spatial incompleteness and temporal incoherence, thus limiting their performance. In this paper, we tackle this problem from a new perspective by enhancing segment-level representations with a simple yet effective graph convolutional network, namely action complement graph network...
Ulcerative colitis (UC) is a chronic gastrointestinal inflammatory disorder with rising prevalence. Due to the recurrent and difficult-to-treat nature of UC symptoms, current pharmacological treatments fail to meet patients' expectations. This study presents a machine learning-assisted high-throughput screening strategy to expedite the discovery of efficient nanozymes for UC treatment. Therapeutic requirements, including antioxidant property, acid stability, and zeta potential, are quantified and predicted by using...
Numerous methods have been proposed for person re-identification, most of which however neglect the matching efficiency. Recently, several hashing based approaches have been developed to make re-identification more scalable to large-scale gallery sets. Despite their efficiency, these works ignore the cross-camera variations, which severely deteriorate the final matching accuracy. To address the above issues, we propose a novel hashing based method for fast person re-identification, namely Cross-camera Semantic Binary Transformation (CSBT). CSBT aims to transform original...
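
The hashing-based abstracts above rest on one generic idea: once images are encoded as short binary codes, matching reduces to Hamming distance, which is cheap to compute. The sketch below illustrates only that generic ranking step. It is not the CSBT method or any other method listed here, and the function names (`pack_bits`, `hamming_distance`, `rank_gallery`) and the toy 8-bit codes are assumptions made for illustration.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into a single Python int (first bit is most significant)."""
    code = 0
    for b in bits:
        code = (code << 1) | (1 if b else 0)
    return code


def hamming_distance(a, b):
    """Number of differing bits between two packed codes (XOR, then count set bits)."""
    return bin(a ^ b).count("1")


def rank_gallery(query_code, gallery_codes):
    """Return gallery indices ordered from nearest to farthest, plus the raw distances."""
    distances = [hamming_distance(query_code, g) for g in gallery_codes]
    # sorted() is stable, so ties keep their original gallery order.
    order = sorted(range(len(gallery_codes)), key=lambda i: distances[i])
    return order, distances


# Hypothetical 8-bit codes; real systems typically use longer learned codes.
query = pack_bits([1, 0, 1, 1, 0, 0, 1, 0])
gallery = [
    pack_bits([1, 0, 1, 1, 0, 0, 1, 0]),  # identical to the query
    pack_bits([0, 1, 0, 0, 1, 1, 0, 1]),  # every bit flipped
    pack_bits([1, 0, 1, 0, 0, 0, 1, 0]),  # one bit differs
]
order, distances = rank_gallery(query, gallery)
```

Packing each code into an integer keeps the comparison to one XOR and one bit count per gallery entry, which is why binary codes scale to large gallery sets.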