Xiaohu Shao

ORCID: 0000-0003-1141-6020
Research Areas
  • Face Recognition and Analysis
  • Face and Expression Recognition
  • Biometric Identification and Security
  • Human Pose and Action Recognition
  • Multimodal Machine Learning Applications
  • Video Surveillance and Tracking Methods
  • Advanced Image and Video Retrieval Techniques
  • 3D Shape Modeling and Analysis
  • Generative Adversarial Networks and Image Synthesis
  • Image and Video Quality Assessment
  • Advanced Image Processing Techniques
  • Domain Adaptation and Few-Shot Learning
  • Advanced X-ray and CT Imaging
  • Advanced Neural Network Applications
  • Topic Modeling
  • Image Retrieval and Classification Techniques
  • Radiomics and Machine Learning in Medical Imaging
  • Visual Attention and Saliency Detection
  • Handwritten Text Recognition Techniques
  • Digital Media Forensic Detection
  • Anomaly Detection Techniques and Applications
  • Diabetic Foot Ulcer Assessment and Management
  • Cancer-Related Molecular Mechanisms Research
  • Gait Recognition and Analysis

University of Chinese Academy of Sciences
2017-2023

Chongqing Institute of Green and Intelligent Technology
2016-2023

Chinese Academy of Sciences
2013-2021

Birkbeck, University of London
2021

University of Science and Technology of China
2011

Regression-based facial landmark detection methods usually learn a series of regression functions to update the landmark positions from an initial estimation. Most existing approaches focus on learning effective mapping functions with robust image features to improve performance. The approach to dealing with the initialization issue, however, receives relatively less attention. In this paper, we present a deep architecture with two-stage re-initialization to explicitly deal with this problem. At the global stage, given a rough face result, the full...

10.1109/cvpr.2017.393 article EN 2017-07-01

We propose a straightforward method that simultaneously reconstructs the 3D facial structure and provides dense alignment. To achieve this, we design a 2D representation called the UV position map, which records the 3D shape of a complete face in UV space, and then train a simple Convolutional Neural Network to regress it from a single 2D image. We also integrate a weight mask into the loss function during training to improve the performance of the network. Our method does not rely on any prior face model, and can reconstruct full facial geometry along with semantic...

10.48550/arxiv.1803.07835 preprint EN public-domain arXiv (Cornell University) 2018-01-01

Deep convolutional neural networks (DCNN) have recently achieved state-of-the-art performance on handwritten Chinese character recognition (HCCR). However, most DCNN models employ the softmax activation function and minimize the cross-entropy loss, which may lose some inter-class information. To cope with this problem, we demonstrate a small but consistent advantage of using both classification and similarity ranking signals as supervision. Specifically, the presented method learns a DCNN model by maximizing...

10.1109/icfhr.2016.0099 article EN 2016-10-01

The quality of face images varies due to complex environmental factors, and images of extremely poor quality would deteriorate the performance of face recognition. As one of the pre-processing modules of face recognition, face quality assessment needs to consider both environmental factors and practical applications. In this paper, we propose a multibranch face quality assessment (MFQA) algorithm that considers comprehensive factors, acting as a reliable reference for the face recognition that follows. A light-weight convolutional neural network (CNN) is used for image feature extraction, and scores...

10.1109/icct46805.2019.8947255 article EN 2019-10-01

This paper introduces our submission to the 2nd Facial Landmark Localisation Competition. We present a deep architecture to directly detect facial landmarks without using face detection as an initialization. The architecture consists of two stages, a Basic Prediction Stage and a Whole Regression Stage. At the former stage, given an input image, the basic landmarks of all faces are detected by a sub-network of landmark heatmap and affinity field prediction. At the latter stage, a coarse canonical pose can be generated by a Pose Splitting Layer based on the visible...

10.1109/cvprw.2017.258 article EN 2017-07-01

Numerous face frontalization methods based on the 3D Morphable Model (3DMM) and Generative Adversarial Networks (GAN) have made great progress in multi-view face recognition. However, facial feature analysis and identity discrimination often suffer from failed results because of monotonous single-domain training and unpredictable input profile faces. To overcome this drawback, we present a novel approach named Well-advised Pose Normalization Network (WAPNN), which leverages multiple domains and extracts features...

10.1109/access.2020.2983459 article EN cc-by IEEE Access 2020-01-01

Single-view 3D human pose estimation (HPE) based on Graph Convolutional Networks (GCNs) currently suffers from problems such as insufficient spatial feature representation, difficult fusion of various information, and depth ambiguity in 2D-to-3D mapping. This paper proposes a framework for monocular 3D HPE based on learning a spatio-temporal attention graph. First, we build a graph acquisition scheme to obtain semantic information with strong representativeness by constructing global and local graphs in a coarse-to-fine way. And then...

10.1109/icip46576.2022.9898019 article EN 2022 IEEE International Conference on Image Processing (ICIP) 2022-10-16

This paper introduces our submission to the 2nd 3DFAW Challenge. To get a high-accuracy 3D dense face shape based on 2D videos or multiple images, a framework consisting of multi-reconstruction branches and a mesh retrieval module is proposed to effectively utilize the information of all frames and the results predicted by all branches. The recent state-of-the-art single-view and multi-view methods are introduced to form an ensemble of independent regression networks. The candidate of each branch is synthesized by weighted linear...

10.1109/iccvw.2019.00372 article EN 2019-10-01

Regression-based face alignment involves learning a series of mapping functions to predict the true landmarks from an initial estimation of the alignment. Most existing approaches focus on learning efficacious mapping functions from some feature representations to improve performance. The issues related to the initial alignment estimation and the final learning objective, however, receive less attention. This work proposes a deep regression architecture with progressive reinitialization and a new error-driven learning loss function to explicitly address the above two issues. Given an image with a rough detection...

10.1109/tpami.2021.3073593 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2021-01-01

Video question answering (VideoQA) is challenging as it requires reasoning about natural language and multimodal interactive relations. Most existing methods apply attention mechanisms to extract interactions between the video and the question or to extract effective spatio-temporal relational representations. However, these methods neglect the implication of intra- and inter-modal relations for learning, and they fail to fully exploit the synergistic effect of multiscale semantics in answer reasoning. In this article, we propose a novel...

10.1145/3630101 article EN ACM Transactions on Multimedia Computing Communications and Applications 2023-10-25

Face frontalization has been widely used in face recognition to alleviate the distribution discrepancy between multi-view faces. Given a profile face, existing models learn to synthesize the frontal face from the whole face region indistinguishably, often resulting in unsatisfactory results caused by a lack of synthetic focus and disturbances from trivial backgrounds. This paper proposes a novel Deep Attention-based Face Frontalization (DAFF) method to address the above issues explicitly. We first inject the 3D spatial prior of the input into an...

10.1109/icme51207.2021.9428396 article EN 2021 IEEE International Conference on Multimedia and Expo (ICME) 2021-06-09

Video question answering (VideoQA) challenges the joint learning of visual and linguistic knowledge. Whilst dynamic video-question interaction is not well explored in previous methods, the search for answer clues from textual semantics is also not valued. To address these issues, this paper proposes a novel Two-Stream Heterogeneous Graph Network (TSHGNet) using Dynamic Interactive Learning (DIL) to accomplish effective reasoning between videos and questions. Inspired by the way people answer questions, the two-stream...

10.1109/ijcnn54540.2023.10191238 article EN 2023 International Joint Conference on Neural Networks (IJCNN) 2023-06-18

X-ray security check is one of the most effective measures widely used in airports, high-speed trains, subways, and other important places. Due to the difference in imaging mechanism between X-ray images and visible images, contraband detection suffers from the problems of large intra-class differences and small inter-class differences. The Softmax loss does not explicitly encourage compactness within a class and separability between classes, resulting in insensitivity to the target. We address this problem by proposing an attention-based...

10.1109/icme55011.2023.00382 article EN 2023 IEEE International Conference on Multimedia and Expo (ICME) 2023-07-01

Deep Convolutional Networks (DCNs) have achieved great success in face detection. Most architectures of the DCN-based methods, however, suffer from multiple separated steps and large-size models, which increase training complexity and also slow down testing speed. In this paper, we propose an efficient end-to-end architecture, called Hierarchical Bilinear Network (HBN), for fast and accurate face detection. It mainly consists of two parts, one of which is the Backbone Network. The Backbone Network generates hierarchical feature maps efficiently...

10.1109/icip.2017.8296314 article EN 2017 IEEE International Conference on Image Processing (ICIP) 2017-09-01

Semi-supervised visual domain adaptation is devoted to adapting a model learned in the source domain to the target domain, where there are only a few labeled samples. In this paper, we propose a semi-supervised cross-domain image recognition method which unifies feature learning and model training into a convolutional neural network framework. Based on labeled samples and massive unlabeled samples in both domains, we specially design three branches for class label, domain label, and similarity prediction, and simultaneously optimize them to generate features with invariance...

10.1109/icip.2017.8296801 article EN 2017 IEEE International Conference on Image Processing (ICIP) 2017-09-01

Facial landmark detection is a fundamental process included in many face analysis tasks. However, it faces great challenges because the facial images to be detected usually contain large variations, which deteriorate the detection precision. Among such variations, partial occlusions and pose variations have notable effects. In this study, the authors present a mixture of discriminative visibility-aware models (MDVMs) for facial landmark detection, which improve the generalisation ability of the model to occlusion and pose variation. By adopting different structure constraints...

10.1049/iet-ipr.2015.0699 article EN IET Image Processing 2016-05-09

In this paper, we propose a method for saliency detection in still images based on Boosting algorithms. Compared to pixel-level detectors, we detect salient regions of an image in sub-windows at any locations and sizes. For each window, we compute a set of features including local contrast and gradient histogram contrast. We construct our detector as a cascade AdaBoost classifier to get the sub-windows which contain salient objects. Generally, more than one sub-window would pass through, so we introduce a score function to remove the redundant ones and obtain the final...

10.1109/iccps.2011.6092280 article EN 2011 International Conference on Computational Problem-Solving (ICCP) 2011-10-01

In this paper, we present a pose-invariant face recognition framework built on 3D face reconstruction and dynamic feature extraction. First, we synthesize the virtual frontal face from the probe face based on 3D reconstruction. In the initialization of the reconstruction, a tree-structured model is applied to detect landmark points in the 2D image, and a hierarchical gaussianization (HG) method is used for estimation. Second, in the recognition step, dynamic feature extraction is used to improve the recognizer, which measures similarity to the synthesized face. Recognition experiments are...

10.2991/3ca-13.2013.30 article EN cc-by-nc Advances in Intelligent Systems Research 2013-01-01

Supervised deep learning models like convolutional neural networks (CNNs) have shown very promising results for the face recognition problem, but they often require a huge number of labeled face images. Since manually labeling a large training set is a difficult and time-consuming task, it is beneficial if the model can be trained from samples with only weak annotations. In this paper, we propose a general framework to train a CNN with weakly labeled facial images that are available on the Internet. Specifically, we first design...

10.1109/icip.2017.8296394 article EN 2017 IEEE International Conference on Image Processing (ICIP) 2017-09-01