Wen Gao

ORCID: 0000-0001-8894-1806
Research Areas
  • Video Coding and Compression Technologies
  • Image and Video Quality Assessment
  • Advanced Vision and Imaging
  • Face and Expression Recognition
  • Advanced Image Processing Techniques
  • Face Recognition and Analysis
  • Advanced Image and Video Retrieval Techniques
  • Advanced Data Compression Techniques
  • Image Retrieval and Classification Techniques
  • Advanced Image Fusion Techniques
  • Video Analysis and Summarization
  • Hand Gesture Recognition Systems
  • Visual Attention and Saliency Detection
  • Image and Signal Denoising Methods
  • Image Enhancement Techniques
  • Video Surveillance and Tracking Methods
  • Hearing Impairment and Communication
  • Biometric Identification and Security
  • Human Pose and Action Recognition
  • Image Processing Techniques and Applications
  • Gait Recognition and Analysis
  • Music and Audio Processing
  • Multimedia Communication and Technology
  • Robotics and Sensor-Based Localization
  • Multimodal Machine Learning Applications

Peking University
2016-2025

Peng Cheng Laboratory
2018-2025

Harbin Institute of Technology
2008-2024

Tencent (China)
2023

Polygon Physics (France)
2023

Central South University
2018-2022

Second Xiangya Hospital of Central South University
2022

Chinese Academy of Sciences
2004-2021

Changchun Institute of Optics, Fine Mechanics and Physics
2021

Hanyang University
2020

For years, researchers in the face recognition area have represented and recognized faces based on subspace discriminant analysis or statistical learning. Nevertheless, these approaches always suffer from the generalizability problem. This paper proposes a novel non-statistical face representation approach, local Gabor binary pattern histogram sequence (LGBPHS), in which no training procedure is needed to construct the face model, so that the generalizability problem is naturally avoided. In this approach, a face image is modeled as...

10.1109/iccv.2005.147 article EN 2005-01-01

Inspired by Weber's Law, this paper proposes a simple, yet very powerful and robust local descriptor, called the Weber Local Descriptor (WLD). It is based on the fact that human perception of a pattern depends not only on the change of a stimulus (such as sound, lighting) but also on the original intensity of the stimulus. Specifically, WLD consists of two components: differential excitation and orientation. The differential excitation component is a function of the ratio between two terms: one is the relative intensity differences of a current pixel against its neighbors, the other is the intensity of the current pixel....

10.1109/tpami.2009.155 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2009-08-21
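
The differential excitation described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 3x3 neighbourhood, the arctangent squashing function, and the name `differential_excitation` are assumptions that the truncated abstract does not state.

```python
import math


def differential_excitation(img, r, c):
    """Differential excitation at interior pixel (r, c) of a 2D grayscale grid.

    Assumes non-negative intensities and a 3x3 neighbourhood (both assumptions).
    """
    center = img[r][c]
    # Sum of intensity differences between the 8 neighbours and the centre pixel.
    diff_sum = sum(
        img[r + dr][c + dc] - center
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # atan2 equals arctan(diff_sum / center) for center > 0
    # and stays finite when center == 0.
    return math.atan2(diff_sum, center)
```

A flat patch yields zero excitation, while the same absolute difference produces a larger response on a dark centre pixel than on a bright one, which is the Weber's Law behaviour the abstract refers to.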

In this paper, we address the problem of classifying image sets, each of which contains images belonging to the same class but covering large variations in, for instance, viewpoint and illumination. We innovatively formulate the problem as the computation of Manifold-Manifold Distance (MMD), i.e., calculating the distance between nonlinear manifolds each representing one image set. To compute MMD, we also propose a novel manifold learning approach, which expresses a manifold by a collection of local linear models, each depicted by a subspace. MMD is then converted...

10.1109/cvpr.2008.4587719 article EN 2008 IEEE Conference on Computer Vision and Pattern Recognition 2008-06-01

Multiple kernel clustering (MKC) algorithms optimally combine a group of pre-specified base kernel matrices to improve clustering performance. However, existing MKC algorithms cannot efficiently address the situation where some rows and columns of base kernel matrices are absent. This paper proposes two simple yet effective algorithms to address this issue. Different from existing approaches where incomplete kernel matrices are first imputed and a standard MKC algorithm is applied to the imputed kernel matrices, our first algorithm integrates imputation and clustering into a unified learning procedure. Specifically, we perform multiple kernel clustering directly with the presence...

10.1109/tpami.2019.2892416 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2019-01-01

Low-light images are not conducive to human observation and computer vision algorithms due to their low visibility. Although many image enhancement techniques have been proposed to solve this problem, existing methods inevitably introduce contrast under- and over-enhancement. Inspired by the human visual system, we design a multi-exposure fusion framework for low-light image enhancement. Based on the framework, we propose a dual-exposure fusion algorithm to provide an accurate contrast and lightness enhancement. Specifically, we first design the weight matrix using...

10.48550/arxiv.1711.00591 preprint EN cc-by-nc-sa arXiv (Cornell University) 2017-01-01

High dynamic range (HDR) imaging techniques have been working constantly, actively, and validly in fault detection and disease diagnosis in the astronomical and medical fields, and currently they have also gained much more attention from the digital image processing and computer vision communities. While HDR imaging devices are starting to have friendly prices, HDR display devices are still out of reach of typical consumers. Due to the limited availability of HDR display devices, in most cases tone mapping operators (TMOs) are used to convert HDR images to standard low dynamic range (LDR) images for...

10.1109/tmm.2016.2518868 article EN IEEE Transactions on Multimedia 2016-01-18

In this paper, a novel unified low-light image enhancement framework for both contrast enhancement and denoising is proposed. First, the low-light image is segmented into superpixels, and the ratio between the local standard deviation and the local gradients is utilized to estimate the noise-texture level of each superpixel. Then the image is inverted to be processed in the following steps. Based on the noise-texture level, a smooth base layer is adaptively extracted by the BM3D filter, and another detail layer is extracted by the first order differential of the inverted image and smoothed with the structural filter. These two layers are combined to get...

10.1109/icip.2015.7351501 article EN 2015 IEEE International Conference on Image Processing (ICIP) 2015-09-01

Recently, transformers have achieved remarkable performance on a variety of computer vision applications. Compared with mainstream convolutional neural networks, vision transformers often have sophisticated architectures for extracting powerful feature representations, which makes them more difficult to develop for mobile devices. In this paper, we present an effective post-training quantization algorithm for reducing the memory storage and computational costs of vision transformers. Basically, the quantization task can be regarded as finding...

10.48550/arxiv.2106.14156 preprint EN other-oa arXiv (Cornell University) 2021-01-01
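
To make the idea of post-training quantization in the abstract above concrete, here is a minimal sketch of uniform quantization with a search over candidate interval widths. It is not the paper's algorithm: the selection criterion here is plain mean-squared error, which is an assumption swapped in for illustration, and the names `quantize` and `best_step` are hypothetical.

```python
def quantize(values, step, bits=8):
    """Uniformly quantize values to signed `bits`-bit integers, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    qmin = -qmax - 1
    out = []
    for v in values:
        # Round to the nearest integer level and clip to the representable range.
        q = max(qmin, min(qmax, round(v / step)))
        out.append(q * step)
    return out


def best_step(values, candidates, bits=8):
    """Pick the candidate interval width with the lowest mean-squared error.

    MSE is an illustrative criterion (assumption), not the paper's objective.
    """
    def mse(step):
        deq = quantize(values, step, bits)
        return sum((a - b) ** 2 for a, b in zip(values, deq)) / len(values)

    return min(candidates, key=mse)
```

No retraining is involved: the interval width is chosen from the already-trained values alone, which is what distinguishes post-training quantization from quantization-aware training.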

Evaluations of the state-of-the-art of both academic face recognition algorithms and commercial systems have shown that the performance of most current technologies degrades due to variations of illumination. We investigate several illumination normalization methods and propose some novel solutions. The main contribution includes: (1) a gamma intensity correction (GIC) method is proposed to normalize the overall image intensity at the given illumination level; (2) a region-based strategy combining GIC and histogram equalization (HE) is proposed to further...

10.1109/amfg.2003.1240838 article EN 2003-01-01
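
A gamma intensity correction step of the kind named in the abstract above might look like the sketch below. The details are assumptions, since the truncated abstract does not give them: intensities are scaled to [0, 1], the correction is `p ** (1 / gamma)`, and gamma is chosen by grid search to match a reference image in the least-squares sense. The names `gamma_correct` and `fit_gamma` are hypothetical.

```python
def gamma_correct(pixels, gamma):
    """Apply p -> p ** (1 / gamma) to intensities scaled to [0, 1] (gamma > 0)."""
    return [p ** (1.0 / gamma) for p in pixels]


def fit_gamma(pixels, reference, candidates):
    """Return the candidate gamma whose corrected image is closest to `reference`.

    Closeness is the sum of squared pixel differences (an assumed criterion).
    """
    def sq_err(gamma):
        corrected = gamma_correct(pixels, gamma)
        return sum((a - b) ** 2 for a, b in zip(corrected, reference))

    return min(candidates, key=sq_err)
```

For example, `fit_gamma(image, canonical, [0.5, 1.0, 2.0, 4.0])` would select the gamma that brings a flattened `image` closest to a flattened `canonical` image captured under the target illumination level.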

The variation of facial appearance due to the viewpoint (/pose) degrades face recognition systems considerably, which is one of the bottlenecks in face recognition. One of the possible solutions is generating a virtual frontal view from any given nonfrontal view to obtain a virtual gallery/probe face. Following this idea, this paper proposes a simple, but efficient, novel locally linear regression (LLR) method, which generates the virtual frontal view from a given nonfrontal face image. We first justify the basic assumption of the paper that there exists an approximate linear mapping between a nonfrontal face image and its frontal counterpart....

10.1109/tip.2007.899195 article EN IEEE Transactions on Image Processing 2007-06-20

We propose a rate-distortion optimization (RDO) scheme based on the structural similarity (SSIM) index, which was found to be a better indicator of perceived image quality than mean-squared error, but has not been fully exploited in the context of image and video coding. At the frame level, an adaptive Lagrange multiplier selection method is proposed based on a novel reduced-reference statistical SSIM estimation algorithm and a rate model that combines the side information with the entropy of the transformed residuals. At the macroblock level, further...

10.1109/tcsvt.2011.2168269 article EN IEEE Transactions on Circuits and Systems for Video Technology 2011-09-22
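
As a rough illustration of how SSIM can stand in for mean-squared error in a Lagrangian rate-distortion cost, the sketch below computes a single-window SSIM and a cost of the form D + lambda * R. Several choices are assumptions not taken from the abstract: the distortion is defined as 1 - SSIM, statistics are computed over the whole patch with population (1/n) normalization, and the constants follow the commonly used K1 = 0.01, K2 = 0.03. The names `ssim` and `rd_cost` are hypothetical.

```python
def ssim(x, y, dynamic_range=255.0):
    """Single-window SSIM between two equal-length lists of pixel intensities."""
    n = len(x)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    mx = sum(x) / n
    my = sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return ((2 * mx * my + c1) * (2 * cov + c2)) / (
        (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    )


def rd_cost(original, reconstructed, rate_bits, lam):
    """Lagrangian cost J = D + lam * R with D = 1 - SSIM (assumed distortion)."""
    return (1.0 - ssim(original, reconstructed)) + lam * rate_bits
```

An encoder would evaluate `rd_cost` for each candidate coding mode and keep the one with the lowest cost; choosing `lam` adaptively is the part the abstract attributes to the frame-level method.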

Summary: Research in proteomics requires powerful database-searching software to automatically identify protein sequences in a complex mixture via tandem mass spectrometry. In this paper, we describe a novel system called pFind (peptide/protein Finder), which employs an effective peptide-scoring algorithm that we reported earlier. The pFind server is implemented with the C++ STL, .Net and XML technologies. As a result, high speed and good usability of the software are achieved.

10.1093/bioinformatics/bti439 article EN Bioinformatics 2005-04-07

For a typical video distribution system, the video contents are first compressed and then stored in local storage or transmitted to end users through networks. While the compressed videos are transmitted through error-prone networks, error robustness becomes an important issue. In the past years, a number of rate-distortion (R-D) optimized coding mode selection schemes have been proposed for error-resilient video coding, including a recursive optimal per-pixel estimate (ROPE) method. However, the ROPE-related approaches assume integer-pixel...

10.1109/tmm.2006.887989 article EN IEEE Transactions on Multimedia 2007-03-29

We propose a perceptual video coding framework based on the divisive normalization scheme, which is found to be an effective approach to model the perceptual sensitivity of biological vision, but has not been fully exploited in the context of video coding. At the macroblock (MB) level, we derive the normalization factors based on the structural similarity (SSIM) index as an attempt to transform the discrete cosine transform domain frame residuals to a perceptually uniform space. We further develop an MB level perceptual mode selection scheme and a frame level global quantization matrix optimization method....

10.1109/tip.2012.2231090 article EN IEEE Transactions on Image Processing 2012-12-04

Vehicle electrification is envisioned to be a significant component of the forthcoming smart grid. In this paper, a smart grid vision of electric vehicles for the next 30 years and beyond is presented from six perspectives pertinent to intelligent transportation systems: 1) vehicles; 2) infrastructure; 3) travelers; 4) systems, operations, and scenarios; 5) communications; and 6) social, economic, and political.

10.1109/tits.2014.2332472 article EN IEEE Transactions on Intelligent Transportation Systems 2014-07-31

The free-energy principle in recent studies of brain theory and neuroscience models the perception and understanding of the outside scene as an active inference process, in which the brain tries to account for the visual scene with an internal generative model. Specifically, with the internal generative model, the brain yields corresponding predictions for its encountered visual scenes. Then, the discrepancy between the visual input and its brain prediction should be closely related to the quality of perceptions. On the other hand, sparse representation has been evidenced to resemble the strategy of the primary visual cortex in the brain for representing...

10.1109/tmm.2017.2729020 article EN IEEE Transactions on Multimedia 2017-07-21

The block discrete cosine transform (BDCT) has been widely used in current image and video coding standards, owing to its good energy compaction and decorrelation properties. However, because of the independent quantization of DCT coefficients in each block, BDCT usually gives rise to visually annoying blocking compression artifacts, especially at low bit rates. In this paper, to reduce blocking artifacts and obtain high-quality images, image deblocking is cast as an optimization problem within a maximum a posteriori framework,...

10.1109/tcsvt.2016.2580399 article EN IEEE Transactions on Circuits and Systems for Video Technology 2016-06-13

Most existing blind image quality assessment (BIQA) methods belong to supervised methods, which always need a large number of image samples and expensive subjective scores for training a prediction model. In this paper, we focus our attention on unsupervised BIQA and put forward a novel approach. The main idea of our method is to quantify image degradation through measuring the structure, naturalness, and perception quality variations of a distorted image from pristine natural images. To be specific, the structure variation is captured by the deviations of phase...

10.1109/tcsvt.2019.2900472 article EN IEEE Transactions on Circuits and Systems for Video Technology 2019-02-21

In this paper, a Rate-GOP based frame level rate control scheme is proposed for High Efficiency Video Coding (HEVC). The proposed scheme is developed with the consideration of the new coding tools adopted into HEVC, including the quad-tree coding structure and the new reference frame selection mechanism, called reference picture set (RPS). The contributions of this paper mainly include the following three aspects. Firstly, an RPS based hierarchical rate control structure is designed to maintain the high video quality of key frames. Secondly, the inter-frame dependency based distortion model and bit rate model are proposed,...

10.1109/jstsp.2013.2272240 article EN IEEE Journal of Selected Topics in Signal Processing 2013-07-03

The quality of screen content images (SCIs) influences the user experience and interactive performance of remote computing systems. With numerous approaches proposed to evaluate the quality of natural images, much less work has been dedicated to reduced-reference image quality assessment (RR-IQA) of SCIs. Here, we propose an RR-IQA method from the perspective of SCI visual perception. In particular, the quality of the distorted SCI is evaluated by comparing a set of extracted statistical features that consider both the primary visual information and the unpredictable...

10.1109/tcsvt.2016.2602764 article EN IEEE Transactions on Circuits and Systems for Video Technology 2016-08-25

In this work, we propose a utility-driven preprocessing technique for high-efficiency screen content video (SCV) compression based on the temporal masking effect, which was found to be a fundamental attribute that plays an important role in human visual perception of video quality, but has not been fully exploited in the context of SCV coding. Specifically, we investigate the temporal masking effect from the perspective of perceived utility, which allows us to preserve the quality of high utility regions and substitute low utility regions with corresponding smooth...

10.1109/tmm.2016.2625276 article EN IEEE Transactions on Multimedia 2016-11-04