- Image Enhancement Techniques
- Advanced Vision and Imaging
- Generative Adversarial Networks and Image Synthesis
- Face recognition and analysis
- Image and Signal Denoising Methods
- Robotics and Sensor-Based Localization
- Advanced Image Fusion Techniques
- Autonomous Vehicle Technology and Safety
- Human Pose and Action Recognition
- Advanced Neural Network Applications
- Speech and Audio Processing
- Video Surveillance and Tracking Methods
- Topic Modeling
- Speech Recognition and Synthesis
- Natural Language Processing Techniques
- Face and Expression Recognition
- Video Analysis and Summarization
- Image Processing Techniques and Applications
- Adhesion, Friction, and Surface Interactions
- Surface Roughness and Optical Measurements
- Enhanced Oil Recovery Techniques
- Computational and Text Analysis Methods
- Computer Graphics and Visualization Techniques
- Biometric Identification and Security
- Precipitation Measurement and Analysis
Wuhan College
2025
Sanya University
2021-2025
Wuhan University of Technology
2011-2025
Shanghai Artificial Intelligence Laboratory
2023
Beijing Academy of Artificial Intelligence
2023
Institute of Porous Flow and Fluid Mechanics
2019
University of Chinese Academy of Sciences
2019
Research Institute of Petroleum Exploration and Development
2019
Rocks are heterogeneous multiscale porous media: two rock samples with identical bulk properties can vary widely in microstructure. The advent of digital technology and modern 3‐D printing provides new opportunities to replicate rocks. However, the inherent trade‐off between imaging resolution and sample size limits the scales over which microstructure and macrostructure can be identified and related to each other. Here, we develop a construction strategy by combining X‐ray computed microtomography...
Face anti-spoofing is becoming increasingly indispensable for face recognition systems, which are vulnerable to various spoofing attacks performed using fake photos and videos. In this paper, a novel "LDN-TOP representation followed by ProCRC classification" pipeline is proposed. We use the local directional number pattern (LDN) with the derivative-Gaussian mask to capture detailed appearance information while resisting illumination variations and noise, which can influence the texture distribution. To further motion...
Multi-pose virtual try-on technology aims to seamlessly fit an in-shop garment onto a reference person in various poses. This has attracted considerable attention from researchers due to its potential commercial and practical applications. Previous works in this field have encountered issues such as unnatural alignment and difficulty preserving the person's identity, arising from the weak mapping relationship between different feature crosses. To address these challenges, this paper proposes a novel network named...
Vision-based localization and mapping in the agricultural environment is challenging due to the unstructured scene with unstable features, illumination variations, bumpy roads, and dynamic environmental objects. To address these challenges, we propose an accurate and robust stereo direct visual odometry system with modifications on Stereo-DSO. We first select some well-matched static points from the latest keyframe to improve the accuracy of the inverse depth calculation for tracking. The can further distinguish close objects...
Surround-view cameras combined with image depth transformation to 3D feature space and fusion with point cloud features are highly regarded. The of 2D into by means of predefined sampling points distribution happens throughout the scene, and this process generates a large number of redundant features. In addition, multimodal unified in often previous step downstream task, ignoring interactive between different scales. To this end, we design a new framework, focusing on that can give geometric perception...
LiDAR is a key sensor for accurately sensing the environment in autonomous driving. While existing 3D object detection methods generally rely on data augmentation and feature fusion to improve performance, the challenge of dealing with sample imbalance is often overlooked. We design a novel network, IFNet, that tackles these issues by introducing mutually reinforcing enhancement strategies. It aims to achieve a dual purpose: 1) correcting category directly enhancing pedestrian samples using mixed...
This work aims to further compensate for the weaknesses of feature sparsity and insufficiently discriminative acoustic features in existing short-duration speaker recognition. To address this issue, we propose Bark-scaled Gauss linear filter bank superposition cepstral coefficients (BGLCC) and a multidimensional central difference (MDCD) extraction method. The focuses on low-frequency information, while filtering is uniformly distributed, therefore, can obtain richer audio signals. In addition,...
Image-based virtual try-on aims to transfer a target clothing item onto a specific person. A significant challenge is that arbitrarily matched clothing and person lack corresponding ground truth for supervised learning. A recent pioneering work leveraged an improved CycleGAN to enable one network to generate the desired image for another during training. However, there is no difference in the result distribution before and after changes. Therefore, using two different networks is unnecessary and may even increase the difficulty of convergence....
Face reenactment is a face image generation method. Its main task is to generate a new face image given a source image and a driving image, which has the facial motion information of the driving image while retaining the content of the source image. Existing flow-based approaches have demonstrated high-quality results, but these works regard head movement as a whole, cannot achieve more flexible control, and often suffer from loss of identity information. In this paper, we propose a novel Controllable multi-identity reenactment (CFReenet), which uses prior...
Outdoor videos sometimes contain unexpected rain streaks due to rainy weather, which have negative effects on subsequent computer vision applications, e.g., video surveillance, object recognition, and tracking. In this paper, we propose a directional regularized tensor-based deraining model that takes into consideration the arbitrary direction of rain streaks. In particular, the sparsity in the spatial derivative domains and the spatiotemporal low-rank property of the background are incorporated into the proposed method....
Makeup transfer requires not only extracting the makeup style of the reference image, but also rendering it at the semantically corresponding position of the target image. However, most existing methods focus on the former and ignore the latter, resulting in a failure to achieve the desired results. To solve the above problems, we propose a unified Symmetric Semantic-Aware Transformer (SSAT) network, which incorporates semantic correspondence learning to realize makeup transfer and removal simultaneously. In SSAT, a novel Semantic Corresponding Feature Transfer (SSCFT) module...
Single-image dehazing is an essential but challenging computer vision problem. Due to the lack of nonhomogeneous haze datasets, most existing image dehazing methods are only applicable to homogeneous rather than nonhomogeneous tasks. In addition, the results are always blurred in detail. Thus, a novel network structure, Knowledge Transfer with Residual Dehazing Network, KTR2DN, is proposed, which consists of two parts: knowledge transfer and super-resolution using (R2) block. The former aims to solve the problem of lacking datasets it...
Adverse weather conditions pose great challenges to computer vision tasks in the wild. Image de-weathering, which aims at removing degradations from videos and images, has hence gained huge popularity as a significant component of image restoration. Considering computational efficiency for on-device applications, autoencoder-based deep models are widely adopted for degradation removal due to their excellent generalization and high efficiency. However, in most of these models, parts high-frequency...
For autonomous driving vehicles, accurately predicting the future trajectories of interactive road agents and planning a trajectory that complies with societal requirements and resembles human-like behavior is extremely important. Existing multi-vehicle prediction methods have redundancy when dealing with multi-agent scenarios, that is, they repeatedly encode invariant scenes around each vehicle, such as lane lines, which leads to increased delays in the model's reasoning. To solve this problem, we propose...
Robust facial landmark localization remains a challenging task when faces are partially occluded. Recently, cascaded pose regression has attracted increasing attention due to its superior performance in facial landmark localization and occlusion detection. However, such an approach is sensitive to initialization, where improper initialization can severely degrade performance. In this paper, we propose a Robust Initialization for Cascaded Pose Regression (RICPR) by providing texture-correlated initial shapes for the testing face. By...
Recently, stereo models based on coarse-to-fine approaches have drastically alleviated the memory footprint and speed limitations of complex network models. However, previous designs used a uniformly defined range for disparity estimation, which ignored the difficulty of pixel matching in different regions and introduced many unnecessary candidates. In this paper, we construct the Adaptive Thin Volume Network (ATVNet) to improve accuracy and reduce computation time. Firstly, multi-scale feature maps are...