- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Image Enhancement Techniques
- Anomaly Detection Techniques and Applications
- Multimodal Machine Learning Applications
- Emotion and Mood Recognition
- Advanced Image Fusion Techniques
- Robotics and Automated Systems
- Face Recognition and Analysis
- Advanced Vision and Imaging
- Gaze Tracking and Assistive Technology
- Autism Spectrum Disorder Research
- Advanced Decision-Making Techniques
- Advanced Image Processing Techniques
- Hand Gesture Recognition Systems
- Evaluation and Optimization Models
- Advanced Neural Network Applications
- Color Science and Applications
- Evaluation Methods in Various Fields
- Face and Expression Recognition
- Gait Recognition and Analysis
- Image and Signal Denoising Methods
- Information and Cyber Security
- Assistive Technology in Communication and Mobility
- Fire Detection and Safety Systems
Harbin Institute of Technology
2021-2025
State Key Laboratory of Robotics and Systems
2023-2024
Shenyang Institute of Automation
2016-2024
Chinese Academy of Sciences
2016-2024
First Affiliated Hospital of Henan University
2022-2023
Xingtai University
2022
City University of Hong Kong
2017-2020
University of Chinese Academy of Sciences
2016-2020
Ministry of Public Security of the People's Republic of China
2011-2016
China Information Technology Security Evaluation Center
2011-2012
Existing snow/rain removal methods often fail for heavy snow/rain and dynamic scenes. One reason for the failure is the assumption that all snowflakes/rain streaks are sparse in scenes. The other is that these methods cannot differentiate moving objects from snowflakes/rain streaks. In this paper, we propose a model based on matrix decomposition for video desnowing and deraining to solve the problems mentioned above. We divide snowflakes/rain streaks into two categories: sparse ones and dense ones. With background fluctuations and optical flow information, the detection of moving objects and sparse snowflakes/rain streaks is formulated as a multi-label...
State-of-the-art multi-object tracking (MOT) methods follow the tracking-by-detection paradigm, where object trajectories are obtained by associating per-frame outputs of object detectors. In crowded scenes, however, detectors often fail to obtain accurate detections due to heavy occlusions and high crowd density. In this paper, we propose a new MOT paradigm, tracking-by-counting, tailored for crowded scenes. Using crowd density maps, we jointly model detection, counting, and tracking of multiple targets as a network flow program, which...
Most existing dehazing networks rely on synthetic hazy-clear image pairs for training, and thus fail to work well in real-world hazy scenes. In this paper, we deduce a reformulated atmospheric scattering model for hazy images and propose a novel lightweight two-branch dehazing network. In the model, we use a Transformation Map to represent the dehazing transformation and a Compensation Map to represent variable illumination compensation. Based on the model, we design a Two-...
Human-Object Interaction (HOI) detection aims to infer interactions between humans and objects, and it is very important for scene analysis and understanding. Existing methods usually focus on exploring instance-level (e.g., object appearance) or interaction-level (e.g., action semantic) features to conduct interaction prediction. However, most of these methods only consider self-triplet feature aggregation, which may lead to learning ambiguity without cross-triplet context exchange. In this paper, from both...
According to the dichromatic reflection model, previous methods of specular reflection separation in image processing often separate specular reflection from a single image using patch-based priors. Due to the lack of global information, these methods cannot completely separate the specular component of an image and are inclined to degrade image textures. In this paper, we derive a color-lines constraint from the dichromatic reflection model to effectively recover the diffuse reflection. Our key observation is that each pixel lies along a color line in normalized RGB space, with different color lines representing distinct chromaticities...
Most recently proposed face completion algorithms use high-level features extracted from convolutional neural networks (CNNs) to recover semantic texture content. Although the completed face is natural-looking, the synthesized content still lacks many high-frequency details, since the high-level features cannot supply sufficient spatial information for detail recovery. To tackle this limitation, in this paper, we propose a ...
The geometric alterations in the iris's appearance are intricately linked to the gaze direction. However, current deep appearance-based gaze estimation methods mainly rely on latent feature sharing to leverage iris features for improving deep representation learning, often neglecting the explicit modeling of their geometric relationships. To address this issue, this paper revisits the physiological structure of the eyeball and introduces a set of geometric assumptions, such as "the normal vector of the iris center approximates the gaze direction". Building on these assumptions, we...
Noisy labels have a profoundly negative impact on Facial Expression Recognition (FER) due to inter-class similarity and subjective annotation. Recent works mainly focus on distinguishing the clean samples or mining the latent truth, which not only needs extra computational overhead, but also carries the error risk of relabeling. In this work, we propose the SNEFER model, which, without elaborately discriminating noisy labels, can adaptively stop their negative effect through a novel contrastive regularization term. Specifically, we establish...
While visual tracking has been greatly improved over recent years, crowd scenes remain particularly challenging for people tracking due to heavy occlusions, high crowd density, and significant appearance variation. To address these challenges, we first design a Sparse Kernelized Correlation Filter (S-KCF) to suppress target response variations caused by occlusions and illumination changes, and spurious responses due to similar distractor objects. We then propose a people tracking framework that fuses the S-KCF response map with an estimated crowd density...
Gaze is a vital feature in analyzing natural human behavior and social interaction. Existing gaze target detection studies learn gaze from gaze orientations and scene cues via a neural network to model gaze in unconstrained scenes. Though they achieve decent accuracy, these methods either employ complex model architectures or leverage additional depth information, which limits their application. This article proposes a simple and effective gaze target detection model that employs dual regression to improve detection accuracy while maintaining low model complexity. Specifically, in training...
Falling snow not only blocks human vision, but also significantly degrades the effectiveness of computer vision systems in outdoor environments. In this paper, we aim to remove snowflakes from videos by using the global and local low-rank property of snowflake-removed scenes. The stationary background and the mixture of moving foreground and falling snowflakes are extracted via low-rank matrix decomposition. Some snowflake features, such as color and size, are used to separate snowflakes from other moving objects. Then, mean absolute difference based...
Background: BBIBP-CorV and CoronaVac inactivated COVID-19 vaccines are widely used, World Health Organization-emergency-listed vaccines. Understanding antibody level changes over time after vaccination is important for booster dose policies. We evaluated neutralizing antibody (nAb) titers and associated factors for the first 12 months after primary-series vaccination with BBIBP-CorV and CoronaVac. Methods: Our study consisted of a set of cross-sectional sero-surveys in Zhejiang and Shanxi provinces, China. In 2021, we enrolled 1,527 consenting...
Recent single image deraining methods either use a recurrent mechanism to gradually learn the mapping between clear images and rainy images, or focus on designing various loss functions to supervise the learning process. In this letter, we propose a dually connected deraining net using pixel-wise attention for rain removal. Specifically, the network adopts an encoder-decoder as the backbone, which can effectively learn the residual rain-streaks map by jointly using a skip sum connection and a skip concatenation connection. The dual connections enable...
Human-Object Interaction (HOI) detection plays a vital role in scene understanding, and aims to predict the HOI triplet in the form of <human, object, action>. Existing methods mainly extract multi-modal features (e.g., appearance, object semantics, human pose) and then fuse them together to directly predict HOI triplets. However, most of these methods focus on self-triplet aggregation but ignore the potential cross-triplet dependencies, resulting in ambiguity in action prediction. In this work, we propose to explore Self- and Cross-Triplet...
Skeleton-based Temporal Action Segmentation involves the dense action classification of variable-length skeleton sequences. Current approaches primarily apply graph-based networks to extract framewise, whole-body-level motion representations, and use one-hot encoded labels for model optimization. However, whole-body motion representations do not capture fine-grained part-level motion, and one-hot encoded labels neglect the intrinsic semantic relationships within the language-based action definitions. To address these limitations, we propose a...
The spectral power distributions (SPD) of outdoor light sources are not constant over time and atmospheric conditions, which causes the appearance variation of a scene and common natural illumination phenomena, such as twilight, shadow, and haze/fog. Calculating the SPD of outdoor light sources at different times (or zenith angles) and under different atmospheric conditions is of interest to physically-based vision. In this paper, for computer vision and its applications, we propose a feasible, simple, and effective SPD calculating method based on analyzing the transmittance...
Facial Expression Recognition (FER) aims to identify emotional expressions in human faces, and it is a fundamental task in computer vision. Recently, some methods that apply Vision Transformer (ViT) to FER have achieved promising results. However, FER still suffers from two key issues: inter-class similarity and intra-class discrepancy. To address these issues, in this letter, we propose a Multi-Scale Attention Learning Network (MALN) based on ViT, which can learn facial expression embeddings in a multi-scale manner....