Shangfei Wang

ORCID: 0000-0003-1164-9895
Research Areas
  • Emotion and Mood Recognition
  • Face and Expression Recognition
  • Face recognition and analysis
  • Video Analysis and Summarization
  • Music and Audio Processing
  • Color perception and design
  • Image Retrieval and Classification Techniques
  • EEG and Brain-Computer Interfaces
  • Gaze Tracking and Assistive Technology
  • Generative Adversarial Networks and Image Synthesis
  • Biometric Identification and Security
  • Human Pose and Action Recognition
  • Text and Document Classification Technologies
  • Speech and Audio Processing
  • Advanced Image and Video Retrieval Techniques
  • Music Technology and Sound Studies
  • Anomaly Detection Techniques and Applications
  • Sentiment Analysis and Opinion Mining
  • Neuroscience and Music Perception
  • Hand Gesture Recognition Systems
  • Visual Attention and Saliency Detection
  • Spam and Phishing Detection
  • Image and Video Quality Assessment
  • Speech and dialogue systems
  • Video Surveillance and Tracking Methods

University of Science and Technology of China
2016-2025

National Taiwan University of Science and Technology
2022-2023

National Science Center
2022

Institute of Art
2022

China National Heavy Duty Truck Group (China)
2022

Northeastern University
2022

Wuhu Hit Robot Technology Research Institute
2021

Dalian University of Technology
2018-2019

Beijing Normal University
2011

Kyushu University
2005-2007

To date, most facial expression analysis has been based on visible and posed expression databases. Visible images, however, are easily affected by illumination variations, while posed expressions differ in appearance and timing from natural ones. In this paper, we propose and establish a natural visible and infrared facial expression database, which contains both spontaneous and posed expressions of more than 100 subjects, recorded simultaneously by a visible and an infrared thermal camera, with illumination provided from three different directions. The posed database includes the apex expressional images with and without glasses....

10.1109/tmm.2010.2060716 article EN IEEE Transactions on Multimedia 2010-07-29

The tracking and recognition of facial activities from images or videos have attracted great attention in the computer vision field. Facial activities are characterized by three levels. First, in the bottom level, facial feature points around each facial component, i.e., eyebrow, mouth, etc., capture the detailed face shape information. Second, in the middle level, facial action units, defined in the facial action coding system, represent the contraction of a specific set of facial muscles, i.e., lid tightener, eyebrow raiser, etc. Finally, in the top level, six prototypical facial expressions represent the global facial muscle...

10.1109/tip.2013.2253477 article EN IEEE Transactions on Image Processing 2013-03-20

Video affective content analysis has been an active research area in recent decades, since emotion is an important component in the classification and retrieval of videos. Video affective content analysis can be divided into two approaches: direct and implicit. Direct approaches infer the affective content of videos directly from related audiovisual features. Implicit approaches, on the other hand, detect affective content based on automatic analysis of a user's spontaneous response while consuming the videos. This paper first proposes a general framework for video affective content analysis, which includes video content, emotional...

10.1109/taffc.2015.2432791 article EN IEEE Transactions on Affective Computing 2015-05-13

Spatial-temporal relations among facial muscles carry crucial information about facial expressions yet have not been thoroughly exploited. One contributing factor for this is the limited ability of current dynamic models to capture complex spatial and temporal relations. Existing dynamic models can only capture simple, local relations among sequential events, or lack the ability to incorporate uncertainties. To overcome these limitations and take full advantage of the spatio-temporal information, we propose to model a facial expression as a complex activity that...

10.1109/cvpr.2013.439 article EN 2013 IEEE Conference on Computer Vision and Pattern Recognition 2013-06-01

In this paper we tackle the problem of facial action unit (AU) recognition by exploiting the complex semantic relationships among AUs, which carry crucial top-down information yet have not been thoroughly exploited. Towards this goal, we build a hierarchical model that combines bottom-level image features and top-level AU relationships to jointly recognize AUs in a principled manner. The proposed model has two major advantages over existing methods. 1) Unlike existing methods that can only capture local pair-wise AU dependencies, ours is...

10.1109/iccv.2013.410 article EN 2013-12-01

Previous studies on facial expression analysis have focused on recognizing basic expression categories. There is a limited amount of work on continuous expression intensity estimation, which is important for detecting and tracking emotion change. Part of the reason is the lack of labeled data with annotated intensity, since intensity annotation requires expertise and is time consuming. In this work, we treat intensity estimation as a regression problem. By taking advantage of the natural onset-apex-offset evolution pattern of facial expression, the proposed method can handle different...

10.1109/cvpr.2016.377 article EN 2016-06-01
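The onset-apex-offset pattern described in the abstract above lends itself to a simple ordinal constraint on predicted intensities: within a sequence, intensity should not decrease before the apex frame and not increase after it. The following is an illustrative sketch of such a constraint as a hinge penalty (the formulation is hypothetical, not the paper's actual method):

```python
import numpy as np

# Sketch of using the onset-apex-offset evolution pattern as weak
# supervision for intensity regression: within a sequence, predicted
# intensity should be non-decreasing up to the apex and non-increasing
# after it. A hinge penalty scores violations of this ordering without
# requiring per-frame intensity labels.

def ordinal_loss(intensities, apex_idx, margin=0.0):
    t = np.asarray(intensities, dtype=float)
    rising = t[:apex_idx + 1]    # onset -> apex segment
    falling = t[apex_idx:]       # apex -> offset segment
    # Penalize any drop before the apex and any rise after it.
    up_viol = np.maximum(0.0, margin + rising[:-1] - rising[1:])
    down_viol = np.maximum(0.0, margin + falling[1:] - falling[:-1])
    return up_viol.sum() + down_viol.sum()

good = [0.0, 0.3, 0.7, 1.0, 0.6, 0.2]   # respects onset-apex-offset
bad  = [0.0, 0.8, 0.2, 1.0, 1.2, 0.1]   # violates the ordering
print(ordinal_loss(good, apex_idx=3), ordinal_loss(bad, apex_idx=3))
```

A sequence that follows the evolution pattern incurs zero penalty, while ordering violations contribute positively to the loss.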

In multi-label learning, each sample can be assigned to multiple class labels simultaneously. In this work, we focus on the problem of multi-label learning with missing labels (MLML), where instead of assuming a complete label assignment is provided for each sample, only partial labels are assigned with values, while the rest are missing or not provided. The positive (presence), negative (absence) and missing labels are explicitly distinguished in MLML. We formulate MLML as a transductive learning problem, where the goal is to recover the full label assignment by enforcing consistency with available label assignments and smoothness...

10.1109/icpr.2014.343 article EN 2014-08-01
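The transductive recovery idea above can be sketched with a simple label-propagation iteration: missing entries are filled in by enforcing smoothness over a sample-similarity graph while staying consistent with the observed entries. This is a minimal illustrative sketch, not the paper's exact formulation:

```python
import numpy as np

# Illustrative sketch of transductive missing-label recovery:
# label matrix entries are +1 (present), -1 (absent), 0 (missing).
# Iterate between graph smoothness (neighbors share labels) and
# consistency with the observed assignments.

def recover_labels(X, Y_obs, alpha=0.5, iters=50):
    # Row-normalized RBF similarity between samples.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2)
    np.fill_diagonal(W, 0.0)
    S = W / W.sum(axis=1, keepdims=True)

    Y = Y_obs.astype(float).copy()
    for _ in range(iters):
        Y = alpha * S @ Y + (1 - alpha) * Y_obs
    return np.sign(Y)  # recovered full assignment

# Two tight clusters; the second sample's second label is missing (0).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
Y_obs = np.array([[1, -1], [1, 0], [-1, 1], [-1, 1]], dtype=float)
Y_full = recover_labels(X, Y_obs)
```

Here the missing entry is filled in from its near-duplicate neighbor, recovering a complete positive/negative assignment.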

Existing facial expression recognition methods either focus on pose variations or identity bias, but not both simultaneously. This paper proposes an adversarial feature learning method to address both of these issues. Specifically, the proposed method consists of five components: an encoder, a classifier, a pose discriminator, a subject discriminator, and a generator. The encoder extracts feature representations, and the classifier tries to perform facial expression recognition using the extracted representations. The encoder and classifier are trained collaboratively, so that the representations are discriminative...

10.1145/3343031.3350872 article EN Proceedings of the 30th ACM International Conference on Multimedia 2019-10-15

In this paper, we propose a novel approach for occluded facial expression recognition with the help of non-occluded images. The non-occluded images are used as privileged information, which is only required during training, but not during testing. Specifically, two deep neural networks are first trained from occluded and non-occluded facial images, respectively. Then the network trained on non-occluded images is fixed and used to guide the fine-tuning of the other network in both label space and feature space. A similarity constraint loss and an inequality regularization are imposed to make the output of the fine-tuned network converge to that of the fixed network. Adversarial learning is adopted...

10.1145/3343031.3351049 article EN Proceedings of the 30th ACM International Conference on Multimedia 2019-10-15
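The label-space and feature-space guidance described above follows a teacher-student pattern: a "teacher" trained with the privileged modality constrains a "student" that only sees the test-time inputs. Below is a minimal sketch of such a combined loss (all names and weightings are hypothetical, not the paper's implementation):

```python
import numpy as np

# Sketch of privileged-information guidance: a teacher network trained
# on non-occluded faces guides a student that sees only occluded faces,
# via similarity terms in label space and feature space.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def guided_loss(student_logits, student_feat,
                teacher_logits, teacher_feat,
                labels, lam_label=1.0, lam_feat=0.1):
    p_s = softmax(student_logits)
    # Classification loss on the student's own predictions.
    ce = -np.log(p_s[np.arange(len(labels)), labels]).mean()
    # Label-space similarity: pull student outputs toward teacher outputs.
    label_sim = ((p_s - softmax(teacher_logits)) ** 2).mean()
    # Feature-space similarity constraint.
    feat_sim = ((student_feat - teacher_feat) ** 2).mean()
    return ce + lam_label * label_sim + lam_feat * feat_sim

s_logits = np.array([[2.0, 0.5, 0.1]])
t_logits = np.array([[3.0, 0.2, 0.0]])
s_feat = np.zeros((1, 8))
t_feat = np.ones((1, 8)) * 0.1
loss = guided_loss(s_logits, s_feat, t_logits, t_feat, labels=np.array([0]))
```

At test time only the student is used, so the privileged non-occluded images are never required after training.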

We propose a multi-modal method with a hierarchical recurrent neural structure to integrate vision, audio, and text features for depression detection. Such a structure contains two hierarchies of bidirectional long short-term memory networks to fuse the multi-modal features and predict the severity of depression. An adaptive sample weighting mechanism is introduced to adapt to the diversity of training samples. Experiments on the testing set of a depression detection challenge demonstrate the effectiveness of the proposed method.

10.1145/3347320.3357696 article EN 2019-10-15

The wide popularity of digital photography and social networks has generated a rapidly growing volume of multimedia data (i.e., image, music, video), resulting in a great demand for managing, retrieving, and understanding these data. Affective computing (AC) of these data can help to understand human behaviors and enable wide applications. In this article, we survey the state-of-the-art AC technologies comprehensively for large-scale heterogeneous multimedia data. We begin by introducing the typical emotion representation models from psychology...

10.1145/3363560 article EN ACM Transactions on Multimedia Computing Communications and Applications 2019-11-30

As one of the most important forms of psychological behaviors, micro-expression can reveal real emotion. However, existing labeled micro-expression samples are too limited to train a high performance classifier. Since micro-expression and macro-expression share some similarities in facial muscle movements and texture changes, in this paper we propose a micro-expression recognition framework that leverages macro-expression samples as guidance. Specifically, we first introduce two Expression-Identity Disentangle Networks, named MicroNet and MacroNet, as feature extractors to disentangle...

10.1145/3394171.3413774 article EN Proceedings of the 30th ACM International Conference on Multimedia 2020-10-12

Creating a large and natural facial expression database is a prerequisite for facial expression analysis and classification. It is, however, not only time consuming but also difficult to capture an adequately large number of spontaneous expression images and their meanings because no standard, uniform, and exact measurements are available for spontaneous expression collection and annotation. Thus, comprehensive first-hand data analyses of a spontaneous expression database may provide insight for future research on database construction, expression recognition, and emotion inference. This paper presents our multimodal visible and infrared...

10.1109/t-affc.2012.32 article EN IEEE Transactions on Affective Computing 2012-10-02

Current works on facial action unit (AU) recognition typically require fully AU-annotated images for supervised AU classifier training. AU annotation is a time-consuming, expensive, and error-prone process. While AUs are hard to annotate, facial expression is relatively easy to label. Furthermore, there exist strong probabilistic dependencies between expressions and AUs, as well as dependencies among AUs. Such dependencies are referred to as domain knowledge. In this paper, we propose a novel AU recognition method that learns AU classifiers from domain knowledge...

10.1109/cvpr.2018.00233 article EN 2018-06-01
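The expression-to-AU dependencies described above can serve as weak supervision: expression labels plus generic priors over which AUs each expression activates yield probabilistic pseudo-labels for AU classifiers. A minimal sketch of this idea follows (the prior probabilities and loss are hypothetical, not the paper's actual model):

```python
import numpy as np

# Illustrative sketch: turn expression-level labels into soft AU
# pseudo-labels via a hypothetical prior P(AU active | expression),
# then train AU predictors against those pseudo-labels.

AUS = ["AU6", "AU12", "AU4"]
# Hypothetical prior probabilities, for illustration only.
PRIOR = {
    "happy": np.array([0.9, 0.95, 0.05]),
    "angry": np.array([0.1, 0.05, 0.9]),
}

def au_pseudo_labels(expressions):
    """Expected AU activations implied by expression-level labels."""
    return np.stack([PRIOR[e] for e in expressions])

def weak_bce(au_probs, pseudo):
    """Cross-entropy against soft pseudo-labels (the weak supervision signal)."""
    eps = 1e-9
    return -(pseudo * np.log(au_probs + eps)
             + (1 - pseudo) * np.log(1 - au_probs + eps)).mean()

pseudo = au_pseudo_labels(["happy", "angry"])
preds = np.array([[0.8, 0.9, 0.1], [0.2, 0.1, 0.8]])
loss = weak_bce(preds, pseudo)
```

Predictions consistent with the expression-implied AU pattern score a lower loss than contradictory ones, so no per-image AU annotation is needed.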

Visible facial images provide the geometric and appearance patterns of expressions but are sensitive to illumination changes. Thermal images record the temperature distribution of the face and are robust to light conditions. Therefore, expression recognition can be enhanced by visible and thermal image fusion. In most cases, however, only visible images are available, due to the widespread popularity of visible cameras and the high cost of thermal cameras. Thus, we propose a novel expression recognition method using thermal infrared (IR) data as privileged information, which is only required during training. Specifically, we first learn a deep model for...

10.1109/tcyb.2017.2786309 article EN IEEE Transactions on Cybernetics 2018-01-11

The inherent connections among aesthetic attributes and overall aesthetics are crucial for image aesthetics assessment, but have not been thoroughly explored yet. In this paper, we propose a novel image aesthetics assessment method assisted by attributes at both the representation level and the label level. The attributes are used as privileged information, which is only required during training. Specifically, we first propose a multitask deep convolutional rating network to learn the aesthetic score and attributes simultaneously. The multitask network constructs better feature representations through multi-task learning. After that,...

10.1609/aaai.v33i01.3301679 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Facial action unit (AU) recognition is formulated as a supervised learning problem by recent works. However, the complex labeling process makes it challenging to provide AU annotations for large amounts of facial images. To remedy this, we utilize the rules defined in the Facial Action Coding System (FACS) to design a novel knowledge-driven self-supervised representation learning framework for AU recognition. The encoder is trained using facial images without AU annotations. Rules summarized from FACS define facial partition manners and determine...

10.1109/cvpr52688.2022.01977 article EN 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2022-06-01

Facial expression recognition from thermal infrared images has attracted more and more attention in recent years. However, the features adopted in current work are either temperature statistical parameters extracted from facial regions of interest or several hand-crafted features commonly used in the visible spectrum. Till now there is no image feature specially defined for thermal infrared images. In this paper, we are the first to propose using a Deep Boltzmann Machine to learn features from long-wavelength thermal infrared images. First, the face is located and normalized from the thermal infrared images. Then, a model composed...

10.1109/acii.2013.46 article EN 2013-09-01

10.1016/j.patcog.2016.07.028 article EN publisher-specific-oa Pattern Recognition 2016-07-22

In this article, we propose a novel approach to recognize emotions with the help of privileged information, which is only available during training, but not during testing. Such additional information can be exploited during training to construct a better classifier. Specifically, we recognize an audience's emotion from EEG signals with the help of stimulus videos, and tag videos' emotions with the aid of electroencephalogram (EEG) signals. First, frequency features are extracted from EEG signals and audio/visual features are extracted from video stimulus. Second, features are selected by statistical tests. Third, new...

10.1109/tamd.2015.2463113 article EN IEEE Transactions on Autonomous Mental Development 2015-07-30