- Human Pose and Action Recognition
- Video Surveillance and Tracking Methods
- Advanced Vision and Imaging
- Advanced Neural Network Applications
- Image Processing Techniques and Applications
- Anomaly Detection Techniques and Applications
- Multimodal Machine Learning Applications
- Gait Recognition and Analysis
- Advanced Image Processing Techniques
- Image Processing and 3D Reconstruction
- Privacy-Preserving Technologies in Data
- 3D Shape Modeling and Analysis
- Robotics and Sensor-Based Localization
- Generative Adversarial Networks and Image Synthesis
- Face recognition and analysis
- Hand Gesture Recognition Systems
- Adversarial Robustness in Machine Learning
- Immune Cell Function and Interaction
- Franchising Strategies and Performance
- Image and Video Stabilization
- ZnO doping and properties
- Digital and Cyber Forensics
- 3D Surveying and Cultural Heritage
- Elevator Systems and Control
- Digital Image Processing Techniques
Yanshan University
2021-2025
Hospital of Hebei Province
2025
Huazhong University of Science and Technology
2023-2024
Beijing University of Posts and Telecommunications
2021-2024
State Key Laboratory of Advanced Electromagnetic Engineering and Technology
2024
Guangzhou Experimental Station
2023
Inner Mongolia University
2023
Dalian Maritime University
2023
Northwestern Polytechnical University
2023
Shanghai University
2022
Skeleton-based human action recognition has received extensive attention due to its efficiency and robustness to complex backgrounds. Though the skeleton can accurately capture the dynamics of human poses, it fails to recognize actions induced by the interaction between humans and objects, making it of great importance to further explore objects for action recognition. In this paper, we devise multi-stream interaction networks (MSIN), which simultaneously explore the skeleton and objects. Specifically, apart from the traditional skeleton stream, 1) the second stream explores object...
Road scene parsing is a crucial capability for self-driving vehicles and intelligent road inspection systems. Recent research has increasingly focused on enhancing driving safety and comfort by improving the detection of both drivable areas and road defects. This article reviews state-of-the-art networks developed over the past decade for both general-purpose semantic segmentation and specialized road scene parsing tasks. It also includes extensive experimental comparisons of these networks across five public datasets. Additionally, we...
Federated Learning (FL) offers a decentralized approach to model training, where data remains local and only model parameters are shared between the clients and the central server. Traditional methods, such as Federated Averaging (FedAvg), linearly aggregate these parameters, which are usually trained on heterogeneous data distributions, potentially overlooking the complex, high-dimensional nature of the parameter space. This can result in degraded performance of the aggregated model. While personalized FL approaches can mitigate the issue to some extent,...
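The linear aggregation attributed to FedAvg above can be illustrated with a minimal sketch. It assumes each client reports its parameters as a dictionary of flat float lists and that clients are weighted by local sample count; the function name `fedavg` and this data layout are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import Dict, List


def fedavg(
    client_params: List[Dict[str, List[float]]],
    client_sizes: List[int],
) -> Dict[str, List[float]]:
    """Weighted linear average of client parameters (illustrative sketch).

    Assumption: every client holds the same parameter names and shapes,
    and client_sizes[i] is the number of local samples on client i.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Dict[str, List[float]] = {}
    for name in client_params[0]:
        acc = [0.0] * len(client_params[0][name])
        for params, n in zip(client_params, client_sizes):
            weight = n / total  # client's share of all training samples
            for i, value in enumerate(params[name]):
                acc[i] += weight * value
        aggregated[name] = acc
    return aggregated
```

Because the result is a plain convex combination of the client parameters, it treats the parameter space as linear, which is the limitation the abstract points to when clients train on heterogeneous data.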
Purpose: The purpose of this study is to improve the accuracy and generalization ability of intelligent fault diagnosis models for rolling bearings under varying operating conditions. By integrating multidimensional features through multi-view learning (MVL) and utilizing Mamba for feature fusion, the method aims to address the challenge of data distribution differences that reduce diagnostic accuracy when working conditions change. The approach also incorporates domain adaptation techniques to align source and target data, ensuring...
Video depth estimation aims to infer temporally consistent depth. Some methods achieve temporal consistency by finetuning a single-image depth model during test time using geometry and re-projection constraints, which is inefficient and not robust. An alternative approach is to learn how to enforce temporal consistency from data, but this requires well-designed models and sufficient video depth data. To address these challenges, we propose a plug-and-play framework called Neural Video Depth Stabilizer (NVDS) that stabilizes inconsistent depth estimations...
Currently, an increasing number of model pruning methods are proposed to resolve the contradictions between the computing power required by deep learning models and resource-constrained devices. However, for simple tasks like robotic detection, most traditional rule-based network pruning methods cannot reach a sufficient compression ratio with low accuracy loss, and they are time consuming as well as laborious. In this article, we propose automatic blockwise and channelwise pruning (ABCP) to jointly search the blockwise and channelwise pruning action for robotic detection with reinforcement...
Depth estimation aims to predict dense depth maps. In autonomous driving scenes, the sparsity of annotations makes the task challenging. Supervised models produce concave objects due to insufficient structural information. They overfit to valid pixels and fail to restore spatial structures. Self-supervised methods are proposed for the problem. Their robustness is limited by pose estimation, leading to erroneous results in natural scenes. In this paper, we propose a supervised framework termed Diffusion-Augmented...
In this study, nitrogen-deficient graphitic carbon nitride (M-LS-g-C3N4) with a mesoporous structure and large specific surface area was obtained by calcination after melt pretreatment using urea as the precursor. X-ray diffraction (XRD), transmission electron microscopy (TEM), N2 adsorption, X-ray photoelectron spectroscopy (XPS), UV-Vis, ESR and photoluminescence (PL) were used to characterize the structure, morphology and optical performance of the samples. The TEM results showed formation on 0.1[Formula:...
Monocular 3D object detection encounters occlusion problems in many application scenarios, such as traffic monitoring and pedestrian monitoring, which leads to serious false negatives. Multi-view object detection effectively solves this problem by combining data from different perspectives. However, due to label confusion and feature confusion, the orientation estimation of multi-view 3D object detection is intractable, although it is important for object tracking and intention prediction. In this paper, we propose a novel multi-view 3D object detection method named MVM3Det, which simultaneously estimates...
As an important application in the field of computer vision, human pose estimation reconstructs various human movements and gestures by detecting the key articulation points of the human body. It is mainly used in behavior recognition, human-computer interaction, and attitude tracking. However, current pose estimation models face many challenges, such as difficulty with non-typical body poses and inaccurate locating of extremities. They are prone to error or lack of information in complex situations. This paper proposes GAN and DCGAN to tackle this...
In the field of monocular depth estimation (MDE), many models with excellent zero-shot performance in general scenes have emerged recently. However, these methods often fail in predicting non-Lambertian surfaces, such as transparent or mirror (ToM) surfaces, due to the unique reflective properties of these regions. Previous methods utilize externally provided ToM masks and aim to obtain correct depth maps through direct in-painting of RGB images. These methods highly depend on the accuracy of the additional input masks, and the use of random colors during in-painting makes them...
Video depth estimation aims to infer temporally consistent depth. One approach is to finetune a single-image model on each video with geometry constraints, which proves inefficient and lacks robustness. An alternative is learning to enforce consistency from data, which requires well-designed models and sufficient video depth data. To address both challenges, we introduce NVDS