- Quantum Information and Cryptography
- Quantum Computing Algorithms and Architecture
- Video Surveillance and Tracking Methods
- Quantum Mechanics and Applications
- Face and Expression Recognition
- Advanced Image and Video Retrieval Techniques
- Human Pose and Action Recognition
- Remote-Sensing Image Classification
- Multimodal Machine Learning Applications
- Image Retrieval and Classification Techniques
- Advanced Neural Network Applications
- Anomaly Detection Techniques and Applications
- Chemical Synthesis and Reactions
- Cryptography and Data Security
- Gait Recognition and Analysis
- Video Analysis and Summarization
- Domain Adaptation and Few-Shot Learning
- Face Recognition and Analysis
- Advanced Clustering Algorithms Research
- Text and Document Classification Technologies
- Metaheuristic Optimization Algorithms Research
- Image Processing Techniques and Applications
- Machine Learning and Data Classification
- Advanced Vision and Imaging
- Advanced Computing and Algorithms
University of Science and Technology of China
2024-2025
Guangzhou Metro Group (China)
2025
Zhengzhou University
2010-2024
University of Chinese Academy of Sciences
2024
Institute of Geographic Sciences and Natural Resources Research
2024
Chinese Academy of Sciences
2024
Shaanxi Normal University
2014-2024
Hebei University of Technology
2015-2024
State Key Laboratory of Nonferrous Metals and Processes
2024
Grinm Advanced Materials (China)
2024
Spectral clustering (SC) has been widely applied to various computer vision tasks, where the key is to construct a robust affinity matrix for data partitioning. With the increase in visual features, conventional SC methods are facing two challenges: 1) how to effectively generate an affinity matrix based on multiple features? and 2) how to deal with high-dimensional features which could be redundant? To address the issues mentioned earlier, we present a new approach to learn an affinity matrix using multiple features, allowing us to simultaneously determine...
Spectral clustering (SC) has been proven to be effective in various applications. However, the learning scheme of SC is suboptimal in that it learns the cluster indicator from a fixed graph structure, which usually requires a rounding procedure to further partition the data. Also, the obtained cluster number cannot reflect the ground truth number of connected components in the graph. To alleviate these drawbacks, we propose a rank-constrained SC with flexible embedding framework. Specifically, an adaptive probabilistic neighborhood process...
A scene graph is a structured representation of a scene that can clearly express the objects, attributes, and relationships between objects in the scene. As computer vision technology continues to develop, people are no longer satisfied with simply detecting and recognizing objects in images; instead, people look forward to a higher level of understanding and reasoning about visual scenes. For example, given an image, we want to not only detect and recognize objects in the image, but also know the relationship between objects (visual relationship detection), and generate a text description (image...
Deep learning has made substantial breakthroughs in many fields due to its powerful automatic representation capabilities. It has been proven that neural architecture design is crucial to the feature representation of data and the final performance. However, the design of the neural architecture heavily relies on the researchers' prior knowledge and experience. And due to the limitations of humans' inherent knowledge, it is difficult for people to jump out of their original thinking paradigm and design an optimal model. Therefore, an intuitive idea would be to reduce human intervention as much as possible...
In recent years, remarkable progress in zero-shot learning (ZSL) has been achieved by generative adversarial networks (GAN). To compensate for the lack of training samples in ZSL, a surge of GAN architectures have been developed by human experts through trial-and-error testing. Despite their efficacy, however, there is still no guarantee that these hand-crafted models can consistently achieve good performance across diversified datasets or scenarios. Accordingly, in this paper, we turn to neural architecture...
Tracking in unmanned aerial vehicle (UAV) scenarios is one of the main components of target-tracking tasks. Different from the target-tracking task in general scenarios, the target-tracking task in UAV scenarios is very challenging because of factors such as small scale and the aerial view. Although the discriminative correlation filter (DCF)-based tracker has achieved good results in tracking tasks in general scenarios, the boundary effect caused by the dense sampling method will reduce the tracking accuracy, especially in UAV-tracking scenarios. In this work, we propose learning an adaptive spatial-temporal context-aware...
An integral part of video analysis and surveillance is temporal activity detection, which means to simultaneously recognize and localize activities in long untrimmed videos. Currently, the most effective methods of temporal activity detection are based on deep learning, and they typically perform very well with large scale annotated videos for training. However, these methods are limited in real applications due to the unavailable videos about certain activity classes and the time-consuming data annotation. To solve this challenging problem, we propose a novel task...
Object detection (OD) is a crucial computer vision task that has seen the development of many algorithms and models over the years. While the performance of current OD models has improved, they have also become more complex, making them impractical for industry applications due to their large parameter size. To tackle this problem, knowledge distillation (KD) technology was proposed in 2015 for image classification and subsequently extended to other visual tasks due to its ability to transfer knowledge learned by complex teacher models to lightweight...
Linear discriminant analysis (LDA) is one of the most important supervised linear dimensional reduction techniques which seeks to learn a low-dimensional representation from the original high-dimensional feature space through a transformation matrix, while preserving the discriminative information via maximizing the between-class scatter matrix and minimizing the within-class scatter matrix. However, the conventional LDA is formulated to maximize the arithmetic mean of trace ratios which suffers from the domination of the largest objectives and might...
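For readers unfamiliar with the quantities named in the abstract above, the following is a minimal sketch of the textbook between-class and within-class scatter matrices and of a single trace ratio for a given transformation matrix. It illustrates the conventional formulation only, not the method proposed in the paper; the function names `scatter_matrices` and `trace_ratio` are hypothetical and chosen for this example.

```python
import numpy as np

def scatter_matrices(X, y):
    """Return (S_b, S_w) for samples X of shape (n, d) and integer labels y of shape (n,)."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    S_b = np.zeros((d, d))  # between-class scatter
    S_w = np.zeros((d, d))  # within-class scatter
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        diff = Xc - mean_c
        S_w += diff.T @ diff
        gap = (mean_c - mean_all)[:, None]
        S_b += len(Xc) * (gap @ gap.T)
    return S_b, S_w

def trace_ratio(W, S_b, S_w):
    """Trace ratio tr(W^T S_b W) / tr(W^T S_w W) for a projection W of shape (d, k)."""
    return np.trace(W.T @ S_b @ W) / np.trace(W.T @ S_w @ W)
```

A larger trace ratio means the projected classes are further apart relative to their internal spread. The ratio is undefined when the projected within-class scatter is zero, which this sketch does not guard against.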
Unsupervised hashing can desirably support scalable content-based image retrieval for its appealing advantages of semantic label independence, memory, and search efficiency. However, the learned hash codes are embedded with limited discriminative semantics due to the intrinsic limitation of image representation. To address the problem, in this paper, we propose a novel hashing approach, dubbed as discrete semantic transfer hashing (DSTH). The key idea is to directly augment the semantics of discrete image hash codes by exploring auxiliary contextual modalities. To this end, a unified...
Linear discriminant analysis (LDA) is a popular technique to learn the most discriminative features for multi-class classification. A vast majority of existing LDA algorithms are prone to be dominated by the class with very large deviation from the others, i.e., edge class, which occurs frequently in multi-class classification. First, the existence of edge classes often makes the total mean biased in the calculation of the between-class scatter matrix. Second, the exploitation of the ℓ2-norm based between-class distance criterion magnifies the extremely large distance corresponding to the edge class. In this...
Current dynamic networks and dynamic pruning methods have shown their promising capability in reducing theoretical computation complexity. However, dynamic sparse patterns on convolutional filters fail to achieve actual acceleration in real-world implementation, due to the extra burden of indexing, weight-copying, or zero-masking. Here, we explore a dynamic network slimming regime, named Dynamic Slimmable Network (DS-Net), which aims to achieve good hardware-efficiency via dynamically adjusting filter numbers at test time with...
Object detection is important in real-world applications. Existing methods mainly focus on object detection with sufficient labelled training data or zero-shot object detection with only concept names. In this paper, we address the challenging problem of zero-shot object detection with natural language description, which aims to simultaneously detect and recognize novel concept instances with textual descriptions. We propose a novel deep learning framework to jointly learn visual units, visual-unit attention and word-level attention, which are combined to achieve word-proposal affinity by...
Dynamic networks have shown their promising capability in reducing theoretical computation complexity by adapting their architectures to the input during inference. However, their practical runtime usually lags behind the theoretical acceleration due to inefficient sparsity. In this paper, we explore a hardware-efficient dynamic inference regime, named dynamic weight slicing, that can be generalized well on multiple dimensions in both CNNs and transformers (e.g. kernel size, embedding dimension, number of heads, etc.). Instead...
Video-based pedestrian reidentification is an emerging task in video surveillance and is closely related to several real-world applications. Its goal is to match pedestrians across multiple nonoverlapping network cameras. Despite the recent effort, the performance of pedestrian reidentification needs further improvement. Hence, we propose a novel two-stream multirate recurrent neural network for video-based pedestrian reidentification with two inherent advantages: First, capturing the static spatial and temporal information; Second, ...
Fuzzy K-Means (FKM) clustering is of great importance for analyzing unlabeled data. FKM algorithms assign each data point to multiple clusters with some degree of certainty measured by the membership function. In these methods, the fuzzy membership matrix is obtained based on the calculation of the distance between data points in the original space. However, this operation may lead to suboptimal results because of the influence of noises and redundant features. Besides, these methods ignore the weighting exponent. In this paper, we propose a novel FKM method called...
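The membership assignment described in the abstract above can be sketched with the standard FKM update rules, where memberships are computed from distances in the original feature space and `m` is the weighting exponent. This is a minimal illustration of conventional FKM under the usual Euclidean-distance assumption, not the method proposed in the paper; the function names `fkm_memberships` and `fkm_centers` are hypothetical.

```python
import numpy as np

def fkm_memberships(X, centers, m=2.0, eps=1e-12):
    """Standard fuzzy K-means membership update for weighting exponent m > 1.

    X has shape (n, d), centers has shape (k, d); returns U of shape (n, k)
    whose rows sum to 1.
    """
    # Squared Euclidean distance from every point to every center.
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    d2 = np.maximum(d2, eps)  # guard against a point sitting exactly on a center
    # u_ik is proportional to d_ik^(-2/(m-1)), i.e. (d_ik^2)^(-1/(m-1)).
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fkm_centers(X, U, m=2.0):
    """Standard center update: mean of the points weighted by U**m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]
```

Alternating the two updates until the memberships stop changing gives the usual FKM iteration. Because the distances are taken directly in the input space, noisy or redundant features affect every membership value, which is the weakness the abstract points out.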