- Video Coding and Compression Technologies
- Advanced Data Compression Techniques
- Advanced Vision and Imaging
- Image and Video Quality Assessment
- Advanced Image and Video Retrieval Techniques
- Advanced Image Processing Techniques
- Visual Attention and Saliency Detection
- Image and Signal Denoising Methods
- Image Enhancement Techniques
- Advanced Image Fusion Techniques
- Image Retrieval and Classification Techniques
- Video Surveillance and Tracking Methods
- Advanced Neural Network Applications
- Video Analysis and Summarization
- Face recognition and analysis
- Medical Image Segmentation Techniques
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Image Processing Techniques and Applications
- Face and Expression Recognition
- Advanced Wireless Communication Techniques
- Multimedia Communication and Technology
- Digital Filter Design and Implementation
- Human Pose and Action Recognition
- Computer Graphics and Visualization Techniques
University of Electronic Science and Technology of China
2016-2025
Chinese University of Hong Kong
2013-2022
Australian National University
2019
University of Science and Technology of China
2019
University of Hong Kong
2008
Nanyang Technological University
2001-2005
National University of Singapore
1985-2005
The University of Western Australia
1995-2004
Applied Materials (United States)
1993-2003
Monash University
1991-2003
This paper addresses our proposed method to automatically segment out a person's face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, reliable, and effective algorithm that exploits the spatial distribution characteristics of human skin color. A universal skin-color map is derived and used on the chrominance component of the input image to detect pixels with skin-color appearance. Then, based on the detected pixels and their corresponding luminance values, the algorithm employs a set of novel regularization...
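The chrominance-based detection step described above can be illustrated with a minimal sketch. It assumes an ITU-R BT.601 RGB-to-YCbCr conversion and a rectangular skin region in the Cb/Cr plane; the threshold values and the names `rgb_to_ycbcr`, `is_skin` and `skin_mask` are illustrative assumptions, not taken from the abstract, and the regularization stages it mentions are not modeled.

```python
# Illustrative Cb/Cr bounds for a skin-color map (assumed values, not from the abstract).
CB_RANGE = (77.0, 127.0)
CR_RANGE = (133.0, 173.0)


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr using the ITU-R BT.601 studio-range equations."""
    y = 16 + (65.481 * r + 128.553 * g + 24.966 * b) / 255
    cb = 128 + (-37.797 * r - 74.203 * g + 112.0 * b) / 255
    cr = 128 + (112.0 * r - 93.786 * g - 18.214 * b) / 255
    return y, cb, cr


def is_skin(r, g, b):
    """Classify one pixel using chrominance only; luminance is ignored here."""
    _, cb, cr = rgb_to_ycbcr(r, g, b)
    return CB_RANGE[0] <= cb <= CB_RANGE[1] and CR_RANGE[0] <= cr <= CR_RANGE[1]


def skin_mask(image):
    """Return a boolean mask for an image given as rows of (r, g, b) tuples."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

Because only Cb and Cr are tested, the mask is largely insensitive to brightness, which is why a later stage would still need the luminance values to clean up false detections.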
This paper proposes a generic model for unsupervised extraction of viewer's attention objects from color images. Without the full semantic understanding of image content, the model formulates the attention objects as a Markov random field (MRF) by integrating computational visual attention mechanisms with attention object growing techniques. Furthermore, we describe the MRF by a Gibbs random field with an energy function. The minimization of the energy function provides a practical way to obtain attention objects. Experimental results on 880 real images and user subjective evaluations by 16 subjects...
In this paper, we introduce a method to detect co-saliency from an image pair that may have some objects in common. The co-saliency is modeled as a linear combination of the single-image saliency map (SISM) and the multi-image saliency map (MISM). The first term is designed to describe the local attention, which is computed by using three saliency detection techniques available in the literature. To compute the MISM, a co-multilayer graph is constructed by dividing the image pair into a spatial pyramid representation. Each node in the graph is described by two types of visual descriptors, which are extracted...
In the image and video processing field, an effective compression algorithm should remove not only the statistical redundancy information but also the perceptually insignificant component from the pictures. The just-noticeable distortion (JND) profile is an efficient model to represent those perceptual redundancies. Human eyes are usually not sensitive to distortion below the JND threshold. In this paper, a DCT-based JND model for monochrome pictures is proposed. This model incorporates the spatial contrast sensitivity function (CSF), the luminance adaptation...
In the research field of image processing, mean squared error (MSE) and peak signal-to-noise ratio (PSNR) are extensively adopted as the objective visual quality metrics, mainly because of their simplicity for calculation and optimization. However, it has been well recognized that these pixel-based difference measures correlate poorly with human perception. Inspired by existing works...
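For reference, the two pixel-based measures named above can be sketched in a few lines. This is a minimal illustration that assumes images are given as flat sequences of 8-bit sample values (peak 255); the function names `mse` and `psnr` are chosen here for illustration.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two equal-length sequences of pixel values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    error = mse(reference, distorted)
    if error == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / error)
```

Both functions treat every pixel independently, so two distortions with the same MSE receive the same PSNR regardless of where in the image the errors fall, which is one reason such measures can disagree with perceived quality.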
The new video coding standard MPEG-4 is enabling content-based functionalities. It takes advantage of a prior decomposition of sequences into video object planes (VOPs) so that each VOP represents one moving object. A comprehensive review summarizes some of the most important motion segmentation and VOP generation techniques that have been proposed. Then, a new automatic video sequence segmentation algorithm that extracts moving objects is presented. The core of this algorithm is an object tracker that matches a two-dimensional (2-D) binary model of the object against subsequent frames using...
Although IEEE 802.11 based wireless local area networks have become more and more popular due to low cost and easy deployment, they can only provide best effort services and do not have quality of service supports for multimedia applications. Recently, a new standard, IEEE 802.11e, has been proposed, which introduces a so-called hybrid coordination function containing two medium access mechanisms: contention-based channel access and controlled channel access. In this article we first give a brief tutorial on the various MAC-layer QoS...
In this paper, we propose an efficient blind image quality assessment (BIQA) algorithm, which is characterized by a new feature fusion scheme and a k-nearest-neighbor (KNN)-based quality prediction model. Our goal is to predict the perceptual quality of an image without any prior information of its reference image and distortion type. Since the reference image is inaccessible in many applications, BIQA is quite desirable in this context. In our method, a new feature fusion scheme is first introduced by combining an image's statistical information from multiple domains (i.e., discrete cosine transform, wavelet, and spatial...
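The idea of KNN-based quality prediction can be sketched as follows. This is a generic illustration, not the model from the abstract: it assumes each training image is already reduced to a numeric feature vector with a known subjective score, uses Euclidean distance, and returns the unweighted mean score of the k nearest neighbors. The name `knn_predict_quality` and its parameters are hypothetical.

```python
import math


def knn_predict_quality(train_features, train_scores, query, k=3):
    """Predict a quality score as the mean score of the k nearest training samples."""
    if k < 1 or k > len(train_features):
        raise ValueError("k must be between 1 and the number of training samples")
    if len(train_features) != len(train_scores):
        raise ValueError("features and scores must have the same length")
    # Pair each training sample's distance to the query with its subjective score.
    ranked = sorted(
        (math.dist(features, query), score)
        for features, score in zip(train_features, train_scores)
    )
    nearest = ranked[:k]
    return sum(score for _, score in nearest) / k
```

A distance-weighted average is a common variation; the plain mean is used here only to keep the sketch short.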
In this paper, a novel reduced-reference (RR) image quality assessment (IQA) is proposed by statistical modeling of the discrete cosine transform (DCT) coefficient distributions. In order to reduce the RR data rates and further exploit the identical nature of the coefficient distributions between adjacent DCT subbands, the DCT coefficients are reorganized into a three-level coefficient tree. Subsequently, the generalized Gaussian density (GGD) is employed to model the coefficient distribution of each subband. The city-block distance is used to measure the difference between the two images....
Segmenting common objects that have variations in color, texture and shape is a challenging problem. In this paper, we propose a new model that efficiently segments common objects from multiple images. We first segment each original image into a number of local regions. Then, we construct a digraph based on local region similarities and saliency maps. Finally, we formulate the co-segmentation problem as a shortest path problem, and use the dynamic programming method to solve the problem. The experimental results demonstrate that the proposed model can group images...
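A shortest path over a layered digraph solved by dynamic programming can be sketched as follows. The layout is an assumption made for illustration: each image contributes one layer of candidate regions, `node_cost[i][j]` is a hypothetical cost for picking region `j` in image `i` (for example, low when the region is salient), and `edge_cost(i, j, k)` is a hypothetical dissimilarity between region `j` of image `i` and region `k` of image `i + 1`. The abstract does not specify how its digraph or costs are built.

```python
def shortest_path_layers(node_cost, edge_cost):
    """Pick one node per layer minimizing total node plus edge cost.

    Returns (total_cost, path), where path[i] is the chosen node in layer i.
    """
    if not node_cost or any(not layer for layer in node_cost):
        raise ValueError("every layer needs at least one node")
    best = list(node_cost[0])  # best[j]: cheapest cost of a path ending at node j
    back_pointers = []
    for i in range(1, len(node_cost)):
        new_best, pointers = [], []
        for k, cost in enumerate(node_cost[i]):
            # Cheapest predecessor in the previous layer for node k.
            j = min(range(len(best)), key=lambda j: best[j] + edge_cost(i - 1, j, k))
            new_best.append(best[j] + edge_cost(i - 1, j, k) + cost)
            pointers.append(j)
        best = new_best
        back_pointers.append(pointers)
    end = min(range(len(best)), key=best.__getitem__)
    path = [end]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[end], path
```

With three layers of two regions each, `node_cost = [[1, 5], [9, 1], [2, 9]]` and a switch penalty of 3 between different region indices, the cheapest selection is `[0, 1, 0]` at cost 10. The running time is linear in the number of layers and quadratic in the number of regions per layer.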
This paper presents the result of a recent large-scale subjective study of image retargeting quality on a collection of images generated by several representative image retargeting methods. Owing to the many approaches to image retargeting that have been developed, there is a need for a diverse independent public database of the retargeted images and the corresponding subjective scores to be freely available. We build an image retargeting quality database, in which 171 retargeted images (obtained from 57 natural source images of different contents) were created. The perceptual quality of each image is subjectively rated by at least 30 viewers,...
In this paper, we propose an unsupervised salient object segmentation approach based on kernel density estimation (KDE) and two-phase graph cut. A set of KDE models are first constructed based on the pre-segmentation result of the input image, and then for each pixel, a set of likelihoods to fit all KDE models are calculated accordingly. The color saliency and spatial saliency of each KDE model are then evaluated based on its color distinctiveness and spatial distribution, and the pixel-wise saliency map is generated by integrating the likelihood measures of pixels and the saliency measures of KDE models. In the first phase of segmentation, graph cut is exploited to obtain...
In this paper, we propose a novel method to discover co-salient objects from a group of images, which is modeled as a linear fusion of an intra-image saliency (IaIS) map and an inter-image saliency (IrIS) map. The first term is to measure the salient objects from each image using multiscale segmentation voting. The second term is designed to detect the co-salient objects from a group of images. To compute the IrIS map, we perform the pairwise similarity ranking based on an image pyramid representation. A minimum spanning tree is then constructed to determine the image matching order. For each region in an image, we design three...
We address the problem of recovering the 3D geometry of a human face from a set of facial images in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. There is an inherent drawback in the single-view setting: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images are given as input. A novel approach is proposed to regress 3DMM...
Unmanned aerial vehicles are an essential component in the realization of Industry 4.0. With drones helping to improve industrial safety and efficiency in utilities, construction, and communication, there is an urgent need for drone-based intelligent applications. In this paper, we develop a unified framework to simultaneously detect and count vehicles from drone images. We first explore why the state-of-the-art detectors fail in highly dense drone scenes, which provides more appropriate insights. Then, we propose an effective loss...
To provide multimedia applications with new functionalities, the new video coding standard MPEG-4 relies on a content-based representation. This requires a prior decomposition of sequences into semantically meaningful, physical objects. We formulate this problem as one of separating foreground objects from the background based on motion information. For the object of interest, a 2D binary model is derived and tracked throughout the sequence. The model points consist of edge pixels detected by the Canny operator. To accommodate rotation...
This paper addresses our proposed method to automatically locate the person's face from a given image that consists of a head-and-shoulders view of the person and a complex background scene. The method involves a fast, simple and yet robust algorithm that exploits the spatial distribution characteristics of human skin color. It first uses the chrominance component of the input image to detect pixels with skin color appearance. Then, based on the detected skin-color pixels and their corresponding luminance values, the algorithm employs some regularization processes to reinforce...
An adaptive cosine transform coding scheme for color images which incorporates human visual properties into the coding scheme is described. It employs adaptive quantization to exploit the statistical nature of the coefficients and block distortion equalization to reduce the block edge structures inherent in block transform coding schemes. Results show that the subjective quality of the reconstructed images at a bit rate of 0.4 bit/pixel or a compression ratio of 60:1 is very good.
Rate control plays an important role in the rapid development of high-fidelity video services. As the High Efficiency Video Coding (HEVC) standard has been finalized, many rate control algorithms are being developed to promote its commercial use. The HEVC encoder adopts a new R-lambda based rate control model to reduce the bit estimation error. However, the R-lambda model fails to consider the frame-content complexity, which ultimately degrades the performance of rate control. In this letter, a gradient based R-lambda (GRL) model is proposed for intra frame rate control, where the gradient can...
Video quality fluctuation plays a significant role in human visual perception, and hence, many rate control approaches have been widely developed to maintain consistent quality for video communication. This paper presents a novel rate control framework based on the Lagrange multiplier in high-efficiency video coding. With the assumption of constant quality control, a new relationship between the distortion and the Lagrange multiplier is established. Based on the proposed distortion model and buffer status, we obtain a computationally feasible solution to the problem of minimizing the distortion variation across...
The emerging high efficiency video coding (HEVC) standard has improved compression performance significantly in comparison with H.264/AVC. However, more intensive computational complexity has been introduced by adopting a number of new coding tools. In this paper, a fast inter CU decision is proposed based on the latent sum of absolute differences (SAD) estimation. Firstly, a two-layer motion estimation (ME) method is designed to take advantage of the latent SAD cost. The two-layer ME can obtain the SAD costs for both the upper CU and its sub-CUs....
Object detection is a significant and challenging problem in the study area of remote sensing image analysis. However, most existing methods tend to miss or incorrectly locate objects due to the various sizes and aspect ratios of objects. In this paper, we propose a novel end-to-end Adaptively Aspect Ratio Multi-Scale Network (A$^2$RMNet) to solve this problem. On the one hand, we design a multi-scale feature gate fusion network to adaptively integrate the multi-scale features of objects. This network is composed of gate fusion modules, refine blocks and region proposal networks....
In the field of objective image quality assessment (IQA), Spearman's $\rho$ and Kendall's $\tau$ are the two most popular rank correlation indicators, which straightforwardly assign uniform weight to all quality levels and assume each pair of images is sortable. They are successful for measuring the average accuracy of an IQA metric in ranking multiple processed images. However, two important perceptual properties are ignored by them as well. Firstly, the sorting accuracy (SA) of high quality images is usually more important than that of poor quality ones in many real world applications, where...
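The two rank correlation indicators mentioned above can be sketched in plain Python. This is a minimal illustration: Spearman's $\rho$ is computed as the Pearson correlation of average ranks, and Kendall's $\tau$ is the tau-a variant, in which tied pairs count as neither concordant nor discordant. The function names are chosen here for illustration, and inputs with zero variance are not handled.

```python
import math
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation between the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    balance = 0
    for i, j in combinations(range(n), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        balance += (product > 0) - (product < 0)
    return balance / (n * (n - 1) / 2)
```

Every pair of images contributes equally to both statistics, whatever its quality level, which is the uniform weighting the abstract refers to.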
Object proposals are used in two-stage detectors, such as R-CNN, to generate detection results, including category predictions and refined bounding-boxes. As a result, the classification scores are assigned to the refined bounding-boxes rather than the object proposals. However, this procedure ignores the discrepancy of data distribution between object proposals and refined bounding-boxes. We consider that this discrepancy could limit the detection accuracy. Specifically, the foreground/background imbalance on object proposals and the inaccurate information from low-IoU proposals hinder the prediction. In this paper, we propose a detector called...