John See

ORCID: 0000-0003-3005-4109
Research Areas
  • Human Pose and Action Recognition
  • Face and Expression Recognition
  • Video Surveillance and Tracking Methods
  • Emotion and Mood Recognition
  • Advanced Image and Video Retrieval Techniques
  • Face recognition and analysis
  • Anomaly Detection Techniques and Applications
  • Advanced Neural Network Applications
  • Visual Attention and Saliency Detection
  • Generative Adversarial Networks and Image Synthesis
  • Multimodal Machine Learning Applications
  • Advanced Vision and Imaging
  • Hand Gesture Recognition Systems
  • Video Analysis and Summarization
  • Domain Adaptation and Few-Shot Learning
  • Image Retrieval and Classification Techniques
  • Aesthetic Perception and Analysis
  • Speech and Audio Processing
  • Robotics and Sensor-Based Localization
  • Digital Media Forensic Detection
  • Computer Graphics and Visualization Techniques
  • Industrial Vision Systems and Defect Detection
  • Diabetic Foot Ulcer Assessment and Management
  • Gaze Tracking and Assistive Technology
  • Advanced Image Processing Techniques

Heriot-Watt University Malaysia
2021-2024

Multimedia University
2012-2021

Heriot-Watt University
2021

ETH Zurich
2019

Yes Technologies (United States)
2019

Shanghai Jiao Tong University
2019

Rational (Germany)
1995

Facial micro-expression (ME) recognition has posed a huge challenge to researchers for its subtlety in motion and limited databases. Recently, handcrafted techniques have achieved superior performance but at the cost of domain specificity and cumbersome parametric tunings. In this paper, we propose an Enriched Long-term Recurrent Convolutional Network (ELRCN) that first encodes each frame into a feature vector through CNN module(s), then predicts the micro-expression by passing the feature vector through a Long Short-term Memory (LSTM) module....

10.1109/fg.2018.00105 article EN 2018-05-01

Micro-expression recognition is still in the preliminary stage, owing much to the numerous difficulties faced in the development of datasets. Since micro-expression is an important affective clue for clinical diagnosis and deceit analysis, much effort has gone into the creation of these datasets for research purposes. There are currently two publicly available spontaneous micro-expression datasets (SMIC and CASME II), both with baseline results released using the widely used dynamic texture descriptor LBP-TOP for feature extraction. Although popular...

10.1371/journal.pone.0124674 article EN cc-by PLoS ONE 2015-05-19

One-stage object detectors are trained by optimizing classification-loss and localization-loss simultaneously, with the former suffering much from the extreme foreground-background class imbalance issue due to the large number of anchors. This paper alleviates this issue by proposing a novel framework to replace the classification task in one-stage detectors with a ranking task, adopting the Average-Precision loss (AP-loss) for the ranking problem. Due to its non-differentiability and non-convexity, the AP-loss cannot be optimized directly. For this purpose, we...

10.1109/cvpr.2019.00526 article EN 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2019-06-01

Automatic facial micro-expression (ME) analysis is a growing field of research that has gained much attention in the last five years. With many recent works testing on limited data, there is a need to spur better approaches that are both robust and effective. This paper summarises the 2nd Facial Micro-Expression Grand Challenge (MEGC 2019) held in conjunction with the 14th IEEE Conference on Face and Gesture Recognition (FG) 2019. In this workshop, we proposed challenges for two tasks (spotting and recognition), aim...

10.1109/fg.2019.8756611 preprint EN 2019-05-01

Micro-expressions are spontaneous, brief and subtle facial muscle movements that expose underlying emotions. Motivated by recent exploits into deep learning for micro-expression analysis, we propose a lightweight dual-stream shallow network in the form of a pair of truncated CNNs with heterogeneous input features. The merging of the convolutional features allows discriminative learning of classes stemming from both streams. Using activation heatmaps, we further demonstrate that the salient areas are well emphasized, and correspond...

10.1109/icip.2019.8802965 article EN 2019 IEEE International Conference on Image Processing (ICIP) 2019-08-26

Micro-expression usually occurs in high-stakes situations and may provide useful information in the field of behavioral psychology for better interpretation and analysis. Unfortunately, it is technically challenging to detect and recognize micro-expressions due to their brief duration and subtle facial distortions. The apex frame, which is the instant indicating the most expressive emotional state in a video, is effective for classifying the emotion in that particular frame. In this work, we present a novel method to spot the apex frame of a spontaneous...

10.1109/acpr.2015.7486586 article EN 2015-11-01

One-stage object detectors are trained by optimizing classification-loss and localization-loss simultaneously, with the former suffering much from the extreme foreground-background class imbalance issue due to the large number of anchors. This paper alleviates this issue by proposing a novel framework to replace the classification task in one-stage detectors with a ranking task, adopting the Average-Precision loss (AP-loss) for the ranking problem. Due to its non-differentiability and non-convexity, the AP-loss cannot be optimized directly. For this purpose, we...

10.1109/tpami.2020.2991457 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2020-04-30

Few-shot action recognition aims to recognize novel action classes (query) using just a few samples (support). The majority of current approaches follow the metric learning paradigm, which learns to compare the similarity between videos. Recently, it has been observed that directly measuring this similarity is not ideal since different action instances may show distinctive temporal distribution, resulting in severe misalignment issues across query and support videos. In this paper, we arrest this problem from two distinct aspects --...

10.1609/aaai.v36i2.20029 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2022-06-28

This paper focuses on self-supervised video representation learning. Most existing approaches follow the contrastive learning pipeline to construct positive and negative pairs by sampling different clips. However, this formulation tends to bias toward the static background and has difficulty establishing global temporal structures. The major reason is that the positive pairs, i.e., different clips sampled from the same video, have limited receptive fields, and usually share similar backgrounds but differ in motions. To address...

10.1007/s44267-023-00034-7 article EN cc-by Visual Intelligence 2024-01-10

In recent years, the state-of-the-art for facial micro-expression recognition has been significantly advanced by deep neural networks. The robustness of deep learning has yielded promising performance beyond that of traditional handcrafted approaches. Most works in the literature have emphasized increasing the depth of networks and employing highly complex objective functions to learn more features. In this paper, we design a Shallow Triple Stream Three-dimensional CNN (STSTNet) that is computationally light whilst...

10.1109/fg.2019.8756567 preprint EN 2019-05-01

Spontaneous subtle emotions are expressed through micro-expressions, which are tiny, sudden and short-lived dynamics of facial muscles, thus posing a great challenge for visual recognition. The abrupt but significant dynamics for the recognition task are temporally sparse while the rest, the irrelevant dynamics, are redundant. In this work, we analyze and enforce sparsity constraints to learn the temporal and spectral structures and eliminate the irrelevant dynamics, which would ease the recognition task in spontaneous emotions. The hypothesis is confirmed by experimental results of automatic emotion...

10.1109/taffc.2016.2523996 article EN IEEE Transactions on Affective Computing 2016-02-03

This paper addresses neural network based post-processing for the state-of-the-art video coding standard, High Efficiency Video Coding (HEVC). We first propose a partition-aware Convolution Neural Network (CNN) that utilizes the partition information produced by the encoder to assist in the post-processing. In contrast to existing CNN-based approaches, which only take the decoded frame as input, the proposed approach considers the coding unit (CU) size information and combines it with the distorted decoded frame such that the artifacts introduced by HEVC are...

10.1109/tmm.2019.2962310 article EN IEEE Transactions on Multimedia 2019-12-25

Body shape is about proportion, and fashion style is all about dressing those proportions to look their very best. Figuring out the styles that suit a body shape can be a daunting task for many people. It is, therefore, essential to develop a framework for learning the compatibility of body shapes and clothing styles. Though designers and stylists have analyzed the correlation between human body shapes and clothing styles for a long time, this issue did not receive much attention in multimedia science. In this paper, we present a novel style recommender, on the basis of the user's body attributes. The rich...

10.1109/tmm.2020.2980195 article EN IEEE Transactions on Multimedia 2020-03-12

Human-Object Interaction (HOI) detection recognizes how persons interact with objects, which is advantageous in autonomous systems such as self-driving vehicles and collaborative robots. However, current HOI detectors are often plagued by model inefficiency and unreliability when making a prediction, which consequently limits their potential for real-world scenarios. In this paper, we address these challenges by proposing ERNet, an end-to-end trainable convolutional-transformer network for HOI detection. The...

10.1109/tip.2022.3231528 article EN IEEE Transactions on Image Processing 2023-01-01