- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Visual Attention and Saliency Detection
- Multimodal Machine Learning Applications
- Image Retrieval and Classification Techniques
- Image Enhancement Techniques
- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Image and Video Quality Assessment
- Advanced Image Processing Techniques
- Medical Image Segmentation Techniques
- Advanced Vision and Imaging
- Advanced Image Fusion Techniques
- Anomaly Detection Techniques and Applications
- COVID-19 diagnosis using AI
- Video Analysis and Summarization
- Robotics and Sensor-Based Localization
- Machine Learning and ELM
- Image Processing Techniques and Applications
- Video Coding and Compression Technologies
- Remote-Sensing Image Classification
- Industrial Vision Systems and Defect Detection
- Context-Aware Activity Recognition Systems
- Remote Sensing and LiDAR Applications
University of Electronic Science and Technology of China
2016-2025
Capital Normal University
2017
Nanyang Technological University
2014
Xihua University
2008
The newly developed HEVC video coding standard can achieve higher compression performance than previous standards such as MPEG-4, H.263 and H.264/AVC. However, HEVC's high computational complexity raises concerns about the computational burden on real-time applications. In this paper, a fast pyramid motion divergence (PMD) based CU selection algorithm is presented for HEVC inter prediction. The PMD features are calculated with the estimated optical flow of the downsampled frames. Theoretical analysis shows that PMD can be used to...
In this paper, we propose an efficient blind image quality assessment (BIQA) algorithm, which is characterized by a new feature fusion scheme and a k-nearest-neighbor (KNN)-based prediction model. Our goal is to predict the perceptual quality of an image without any prior information about its reference image and distortion type. Since the reference image is inaccessible in many applications, BIQA is quite desirable in this context. In our method, a new feature fusion scheme is first introduced by combining an image's statistical information from multiple domains (i.e., discrete cosine transform, wavelet, and spatial...
Segmenting common objects that have variations in color, texture and shape is a challenging problem. In this paper, we propose a new model that efficiently segments common objects from multiple images. We first segment each original image into a number of local regions. Then, we construct a digraph based on local region similarities and saliency maps. Finally, we formulate the co-segmentation problem as a shortest path problem, and use the dynamic programming method to solve the problem. The experimental results demonstrate that the proposed model can efficiently segment the common objects from a group of images...
Classifying texture images, especially those with significant rotation, illumination, scale, and viewpoint changes, is a fundamental and challenging problem in computer vision. This paper proposes a simple yet effective image descriptor, called Locally Encoded TRansform feature hISTogram (LETRIST), for texture classification. LETRIST is a histogram representation that explicitly encodes the joint information within an image across feature and scale spaces. The proposed LETRIST is training-free, low-dimensional, yet discriminative and robust...
In this paper, we propose a novel method to discover co-salient objects from a group of images, which is modeled as a linear fusion of an intra-image saliency (IaIS) map and an inter-image saliency (IrIS) map. The first term is to measure the salient objects from each image using multiscale segmentation voting. The second term is designed to detect the co-salient objects from a group of images. To compute the IrIS map, we perform a pairwise similarity ranking based on an image pyramid representation. A minimum spanning tree is then constructed to determine the image matching order. For each region in an image, we design three...
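The linear fusion of the two saliency maps mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `fuse_saliency`, the single scalar `weight`, and the list-of-rows map layout are all assumptions made for illustration.

```python
def fuse_saliency(iais, iris, weight=0.5):
    """Linearly fuse two same-sized saliency maps given as lists of rows.

    `iais` stands for an intra-image saliency map and `iris` for an
    inter-image saliency map. The scalar `weight` is a hypothetical
    parameter; the abstract does not say how the fusion is weighted.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(iais) != len(iris) or any(len(a) != len(b) for a, b in zip(iais, iris)):
        raise ValueError("saliency maps must have the same shape")
    # Per-pixel convex combination of the two maps.
    return [
        [weight * a + (1.0 - weight) * b for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(iais, iris)
    ]
```

For example, `fuse_saliency([[1.0, 0.0]], [[0.0, 1.0]], weight=0.25)` gives more influence to the second map than to the first.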
This letter presents a noise-robust descriptor by exploring a set of local contrast patterns (LCPs) via global measures for texture classification. To handle image noise, the directed and undirected difference masks are designed to calculate three types of intensity contrasts: directed, undirected, and maximum difference responses. To describe pixel-wise features, these responses are separately quantized and encoded into specific patterns based on different global measures. These resulting patterns (i.e., LCPs) jointly form our final...
The emerging high efficiency video coding (HEVC) standard has improved compression performance significantly in comparison with H.264/AVC. However, more intensive computational complexity has been introduced by adopting a number of new coding tools. In this paper, a fast inter CU decision is proposed based on the latent sum of absolute differences (SAD) estimation. Firstly, a two-layer motion estimation (ME) method is designed to take advantage of the latent SAD cost. The two-layer ME can obtain the SAD costs for both the upper CU and its sub-CUs....
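To make the SAD cost concrete, the following minimal sketch computes the sum of absolute differences for a square block and for its four quadrants. It only illustrates the SAD definition and is not the two-layer ME method from the paper: the helper names `sad` and `quadrant_sads`, the list-of-rows block layout, and the use of a single co-located reference block (no motion search) are assumptions.

```python
def sad(block, reference):
    """Sum of absolute differences between two same-sized blocks (lists of rows)."""
    return sum(
        abs(p - q)
        for row_p, row_q in zip(block, reference)
        for p, q in zip(row_p, row_q)
    )


def quadrant_sads(block, reference):
    """SAD of each quadrant of a square block with even side length.

    Returns costs in the order top-left, top-right, bottom-left, bottom-right.
    The quadrants play the role of the sub-CUs of an upper CU in this sketch.
    """
    half = len(block) // 2
    costs = []
    for row_start in (0, half):
        for col_start in (0, half):
            sub_block = [r[col_start:col_start + half] for r in block[row_start:row_start + half]]
            sub_ref = [r[col_start:col_start + half] for r in reference[row_start:row_start + half]]
            costs.append(sad(sub_block, sub_ref))
    return costs
```

When the same reference block is used for every quadrant, the four quadrant costs add up to the cost of the whole block, so one pass over the pixels yields costs at both levels.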
Object detection is a significant and challenging problem in the study area of remote sensing image analysis. However, most existing methods tend to miss or incorrectly locate objects due to the various sizes and aspect ratios of objects. In this paper, we propose a novel end-to-end Adaptively Aspect Ratio Multi-Scale Network (A²RMNet) to solve this problem. On the one hand, we design a multi-scale feature gate fusion network to adaptively integrate the multi-scale features of objects. This network is composed of gate fusion modules, refine blocks and region proposal networks....
In the field of objective image quality assessment (IQA), Spearman's $\rho$ and Kendall's $\tau$ are the two most popular rank correlation indicators, which straightforwardly assign uniform weight to all quality levels and assume each pair of images is sortable. They are successful for measuring the average accuracy of an IQA metric in ranking multiple processed images. However, two important perceptual properties are ignored by them as well. Firstly, the sorting accuracy (SA) of high quality images is usually more important than that of the poor quality ones in many real world applications, where...
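For reference, the two classical indicators named in this abstract can be sketched in a few lines. This is a minimal illustration of the standard, uniformly weighted definitions, assuming no tied values in either list. It is not the weighted measure the paper goes on to propose, and the function names are chosen here for illustration.

```python
def ranks(values):
    """Rank positions (1 = smallest), assuming all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rho via the squared rank-difference formula (no ties)."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs (no ties)."""
    n = len(x)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Both functions treat every pair of images identically, which is the uniform weighting the abstract refers to: a swap between two high quality images and a swap between two poor quality images change the score by the same amount.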
Images acquired by outdoor vision systems easily suffer from poor visibility and annoying interference due to rainy weather, which poses a great challenge for accurately understanding and describing the visual contents. Recent research has devoted great efforts to the task of rain removal for improving image visibility. However, there has been very little exploration of the quality assessment of de-rained images, even though it is crucial for measuring the performance of various de-raining algorithms. In this paper, we first create a de-raining quality assessment (DQA)...
The newly developed High Efficiency Video Coding (HEVC) Standard has improved video coding performance significantly in comparison to its predecessors. However, more intensive computational complexity is introduced by implementing a number of new coding tools. In this paper, a fast coding unit (CU) decision based on Markov random field (MRF) is proposed for HEVC inter frames. First, it is observed that the variance of the absolute difference (VAD) is proportional to the rate-distortion (R-D) cost. The VAD feature is designed for the CU...
Blind image quality assessment (BIQA) aims to estimate the subjective quality of a query image without access to the reference image. Existing learning-based methods typically train a regression function by minimizing the average error between subjective opinion scores and model predictions. However, minimizing the average error does not necessarily lead to correct rank-orders between the test images, which is a highly desirable property of image quality models. In this paper, we propose a novel rank-order regularized regression model to address this problem. The key idea is to introduce a pairwise rank-order constraint into the maximum...
Head detection plays an important role in localizing and identifying persons from visual data. Most existing methods treat head detection as a specific form of object detection. Head detection is nontrivial due to the considerable difficulty in building local and global information under the conditions of unconstrained pose and orientation. To address these issues, this paper presents an effective adaptive relational network to capture context information, which is greatly helpful to suppress missed detection. We show that fundamental contextual...
Pixel-level segmentation has been widely used to improve object detection. Most of the existing methods refine the detection features by adding the segmentation constraint branch or simply embedding the high-level segmentation features into the detection features within the local receptive field. However, noisy segmentation features are unavoidable in real-world applications and can easily cause false positives. To address this problem, we propose a novel hierarchical context module to effectively embed the segmentation features. The idea is to capture context information that includes objects and parts by nonlocal learning...
In the field of computer vision, fine-grained image retrieval is an extremely challenging task due to the inherently subtle intra-class object variations. In addition, the high-dimensional real-valued features extracted from large-scale datasets slow the retrieval speed and increase the storage cost. To solve the above issues, existing methods mainly focus on finding more discriminative local regions for generating compact hash codes, which achieve limited performance due to large quantization errors and confounding granularities...
Class-incremental semantic segmentation aims to incrementally learn new classes while maintaining the capability to segment old ones, and suffers from catastrophic forgetting since the old-class labels are unavailable. Most existing methods are based on convolutional networks and prevent forgetting through knowledge distillation, which (1) need to add additional convolutional layers to predict new classes, and (2) fail to distinguish different regions corresponding to old and new classes during knowledge distillation and roughly distill all the features, thus limiting the learning of new classes....
Few-shot segmentation (FSS) aims to segment the novel class with a few annotated images. Due to CLIP's advantages of aligning visual and textual information, the integration of CLIP can enhance the generalization ability of the FSS model. However, even with the CLIP model, existing CLIP-based FSS methods are still subject to biased prediction towards the base class, which is caused by the class-specific feature level interactions. To solve this issue, we propose a Prior Guided Mask Assemble Network (PGMA-Net). It employs a class-agnostic mask...
Image retrieval with fine-grained categories is an extremely challenging task due to the high intraclass variance and low interclass variance. Most previous works have focused on localizing discriminative image regions in isolation, but have rarely exploited correlations across the different discriminative regions to alleviate intraclass differences. In addition, the intraclass compactness of embedding features is ensured by extra regularization terms that only exist during the training phase, which appear to generalize less well in the inference phase. Finally,...