- Multimodal Machine Learning Applications
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Human Pose and Action Recognition
- Millimeter-Wave Propagation and Modeling
- Remote-Sensing Image Classification
- Indoor and Outdoor Localization Technologies
- Anomaly Detection Techniques and Applications
- Ultra-Wideband Communications Technology
- Video Analysis and Summarization
- Remote Sensing and Land Use
- Advanced MIMO Systems Optimization
- Data Mining Algorithms and Applications
- Antenna Design and Analysis
- Video Surveillance and Tracking Methods
- Seismic Imaging and Inversion Techniques
- Generative Adversarial Networks and Image Synthesis
- Drilling and Well Engineering
- Visual Attention and Saliency Detection
- Advanced Chemical Sensor Technologies
- Gait Recognition and Analysis
- Rough Sets and Fuzzy Logic
- Human Motion and Animation
- Full-Duplex Wireless Communications
- Sparse and Compressive Sensing Techniques
China University of Petroleum, East China
2016-2025
Institute of Software
2023-2025
Beijing Jiaotong University
2024
University of Victoria
2021
China University of Petroleum, Beijing
2014
Ocean University of China
2011-2013
Huadong Hospital
2013
Max Planck Institute for Informatics
2011
Max Planck Society
2011
Feng Chia University
2008
The field of remote sensing (RS) image change detection (CD) has made significant progress, largely due to the powerful feature representation abilities of deep learning. However, traditional methods have not fully exploited the valuable information in the differences. These methods often treat models as tools to extract features from individual images, which limits their ability to effectively describe the differences. Additionally, many approaches tend to focus on spatial differences, while neglecting variations in the channel dimension. In...
Image and sentence matching has attracted increasing attention since it is associated with two important modalities, vision and language. Previous methods aim to find the latent correspondences between image regions and words by aggregating the similarities of region-word pairs. However, these approaches pay little attention to the relationships among the diverse pairs and treat all pairs equally. Moreover, by focusing too heavily on fine-grained alignment, the true meaning of the original is likely to be distorted. In this paper, a novel Region...
Recently, prompt learning has emerged as a viable technique for fine-tuning pre-trained vision–language models (VLMs). The use of prompts allows VLMs to be quickly adapted to specific downstream tasks, bypassing the necessity to update the original weights. Nevertheless, much of the existing work on prompt learning has focused primarily on the utilization of non-specific prompts, with little attention paid to category-specific data. In this paper, we present a novel method, Category-Specific Prompt (CSP), which integrates task-oriented...
Precise object counting is crucial in practical applications, finding extensive utility across numerous societal domains. In the context of few-shot counting, variations in angles can significantly alter the distribution and distinguishability of feature points, thereby increasing the difficulty of feature extraction. To address these challenges, a spatial and channel similarity-aware attention-enhancement network for few-shot counting scenarios is introduced. The network employs slice convolution and attention mechanisms within the spatial and channel dimensions,...
Defocus deblurring is a challenging task in the fields of computer vision and image processing. The irregularity of defocus blur kernels, coupled with the limitations of computational resources, poses significant difficulties for defocused image restoration. Additionally, the varying degrees of blur across different regions impose higher demands on feature capture. Insufficient fine-grained feature extraction can result in artifacts and the loss of details, while inadequate coarse-grained feature extraction can cause image distortion and unnatural transitions. To address...
Sentiment analysis is rapidly advancing by utilizing various data modalities (e.g., text, video, and audio). However, most existing techniques only learn the atomic-level features that reflect strong correlations, while ignoring more complex compositions in multimodal data. Moreover, they also neglect the incongruity of semantic distribution among modalities. In light of this, we introduce a novel Hierarchical Correlation Modeling Network (HCMNet), which enhances multimodal sentiment analysis by exploring both...
With the development of deep neural networks, hyperspectral image (HSI) classification systems have achieved a significant improvement. These systems require numerous and accurate labeled hyperspectral data to be adequately trained. However, noisy labels are inherent in real-world systems, resulting in unreliable decisions. To handle noisy labels in HSI classification, an end-to-end attentive-adaptive network (AAN) is proposed for robust HSI training. The goal is to build a classifier with strong generalization capabilities that...
In order to make more effective use of Wi-Fi fingerprint data to position an object, an improved adaptive genetic algorithm (IAGA) is proposed to optimize the BP (Back Propagation) neural network, namely, IAGA-BP. In this method, selection, crossover and mutation operations are used to optimize the weights and biases of the BP neural network. On one hand, IAGA improves the selection operator in the genetic algorithm on the basis of the optimal-preservation strategy. That is, the population of each generation will be sorted according to adaptability from highest to lowest, then 20% directly...
Hyperspectral anomaly detection (HAD) is crucial for identifying and analyzing abnormal objects in various domains. While existing methods have shown promising results by designing methods tailored to specific characteristics, there is a need for a highly versatile approach that can effectively handle anomalies, particularly those with large spatial sizes. In this article, we propose an end-to-end corner-visible network (CVNet) for unsupervised HAD. Specifically, we introduce a convolution that leverages the statistical...
Image captioning with natural language has been an emerging trend. However, the social image, associated with a set of user-contributed tags, has rarely been investigated for a similar task. The user-contributed tags, which could reflect user attention, have been neglected in conventional image captioning. Most existing image captioning models cannot be applied directly to social image captioning. In this work, a dual attention model is proposed by combining visual attention and user attention simultaneously. Visual attention is used to compress a large amount of salient visual information, while user attention is used to adjust the description of the social images with user-contributed tags....
Image captioning based on reinforcement learning (RL) methods has achieved significant success recently. Most of these methods take the CIDEr score as the reward of the reinforcement learning algorithm to compute gradients, thus refining the image captioning baseline model. However, the CIDEr score is not the sole criterion to judge the quality of a generated caption. In this paper, a Hierarchical Attention Fusion (HAF) model is presented for image captioning based on RL, where multi-level feature maps of ResNet are integrated with hierarchical attention. A Revaluation network (REN) is exploited for revaluating the CIDEr score by...