- Advanced Image and Video Retrieval Techniques
- Robotics and Sensor-Based Localization
- Advanced Neural Network Applications
- Advanced Vision and Imaging
- 3D Surveying and Cultural Heritage
- 3D Shape Modeling and Analysis
- Remote Sensing and LiDAR Applications
- Video Surveillance and Tracking Methods
- COVID-19 diagnosis using AI
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Face recognition and analysis
- Image Processing Techniques and Applications
- Human Motion and Animation
- Advanced Image Processing Techniques
- Remote-Sensing Image Classification
- Anomaly Detection Techniques and Applications
- Advanced Wireless Communication Technologies
- Face and Expression Recognition
- Energy Harvesting in Wireless Networks
- Computer Graphics and Visualization Techniques
- Digital Media Forensic Detection
- Image Enhancement Techniques
- Visual Attention and Saliency Detection
Beijing University of Posts and Telecommunications
2021-2025
Dali University
2025
Institute of Automation
2012-2022
Chinese Academy of Sciences
2013-2022
Anhui Science and Technology University
2022
Anhui University of Science and Technology
2022
University of Chinese Academy of Sciences
2018-2021
Shandong Institute of Automation
2015-2021
Beijing Academy of Artificial Intelligence
2020-2021
ORCID
2021
Accurate road detection and centerline extraction from very high resolution (VHR) remote sensing imagery are of central importance in a wide range of applications. Due to the complex backgrounds and occlusions from trees and cars, most methods produce heterogeneous segments; besides, for the centerline extraction task, current approaches fail to extract a centerline network that is smooth, complete, and of single-pixel width. To address the above-mentioned issues, we propose a novel deep model, i.e., a cascaded end-to-end convolutional neural...
High spatial resolution (HSR) remote sensing images contain complex foreground-background relationships, which makes land cover segmentation a special semantic segmentation task. The main challenges come from large-scale variation, background samples, and imbalanced distribution. These issues make recent context modeling methods sub-optimal due to the lack of foreground saliency modeling. To handle these problems, we propose a Remote Sensing Segmentation framework (RSSFormer), including Adaptive TransFormer...
In this paper, we propose a robust framework for building extraction in visible band images. We first get an initial classification of the pixels based on an unsupervised presegmentation. Then, we develop a novel conditional random field (CRF) formulation to achieve accurate rooftop extraction, which incorporates pixel-level information and segment-level information for the identification of rooftops. Compared with the commonly used CRF model, a higher order potential defined on the segment is added to our model by exploiting region...
Efficiently training accurate deep models for weakly supervised semantic segmentation (WSSS) with image-level labels is challenging and important. Recently, end-to-end WSSS methods have become the focus of research due to their high efficiency. However, current methods suffer from insufficient extraction of comprehensive information, resulting in low-quality pseudo-labels and sub-optimal solutions for WSSS. To this end, we propose a simple and novel Self Correspondence Distillation (SCD) method to refine pseudo-labels without...
In this paper, we propose a novel two-step building extraction method from remote sensing images by integrating a saliency cue. We first utilize classical features such as shadow, color, and shape to find initial building candidates. A fully connected conditional random field model is introduced in this step to ensure that most of the buildings are incorporated. While it is hard to further remove mislabeled rooftops from the candidates using only classical features, we adopt the saliency cue as a new feature to determine whether there is a rooftop in each...
Generating accurate pseudo-labels under the supervision of image categories is a crucial step in Weakly Supervised Semantic Segmentation (WSSS). In this work, we propose a Mat-Label pipeline that provides a fresh way to treat WSSS pseudo-label generation as an image matting task. By taking a trimap as input, which specifies the foreground, background, and unknown regions, the matting task outputs an object mask with fine edges. The intuition behind our pipeline is that generating a trimap is much easier than generating pseudo-labels directly in the weakly supervised setting. Although current...
The Class Activation Map (CAM) is widely used to generate pseudo-labels for Weakly Supervised Semantic Segmentation (WSSS), but it does not adequately consider the modeling of foreground-independent information, which makes it prone to false positive pixels. In this paper, we propose a Wave-like Class Activation Map (WaveCAM) from the perspective of representation fusion and dynamic aggregation to alleviate the above problem. Specifically, our WaveCAM includes foreground-aware that enhances the perception of foreground...
Local feature matching enjoys wide-ranging applications in the realm of computer vision, encompassing domains such as image retrieval, 3D reconstruction, and object recognition. However, challenges persist in improving the accuracy and robustness of matching due to factors like viewpoint and lighting variations. In recent years, the introduction of deep learning models has sparked widespread exploration into local feature matching techniques. The objective of this endeavor is to furnish a comprehensive overview of local feature matching methods. These methods are...
In this paper, we describe a novel procedural modeling technique for generating realistic plant models from multi-view photographs. The realism is enhanced via visual and spatial information acquired from the images. In contrast to previous approaches that heavily rely on user interaction to segment plants or recover branches in images, our method automatically estimates an accurate depth map of each image and extracts a 3D dense point cloud by exploiting an efficient stereophotogrammetry approach. Taking as soft...
Deep learning based classifiers on 3D point cloud data have been shown to be vulnerable to adversarial examples, while a defense strategy named Statistical Outlier Removal (SOR) is widely adopted to defend against adversarial examples successfully, by discarding outlier points in the point cloud.
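As a rough illustration of the SOR idea mentioned above (discarding outlier points from a point cloud), the following is a minimal sketch, not the defense as evaluated in the paper. The function name, the neighbor count `k`, and the threshold multiplier `alpha` are assumptions made for this example; the brute-force distance matrix is only suitable for small clouds.

```python
import numpy as np

def statistical_outlier_removal(points, k=8, alpha=1.0):
    """Drop points whose mean k-nearest-neighbor distance is unusually large.

    `k` and `alpha` are illustrative defaults, not values from the source.
    """
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances (O(N^2) memory, fine for small clouds).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Sort each row; column 0 is the zero distance of a point to itself.
    knn = np.sort(dist, axis=1)[:, 1:k + 1]
    mean_knn = knn.mean(axis=1)
    # Keep points within `alpha` standard deviations of the global mean.
    threshold = mean_knn.mean() + alpha * mean_knn.std()
    keep = mean_knn <= threshold
    return points[keep], keep
```

Calling `statistical_outlier_removal(cloud)` on an `(N, 3)` array returns the filtered points together with a boolean mask marking which of the original points were kept.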
For a long time, local descriptor learning has benefited from the use of L2 normalization, which projects the descriptor space onto the hypersphere. However, there is no free lunch in the world. Although the hypersphere description stabilizes optimization and improves the repeatability of descriptors, it causes the descriptors to have a denser distribution, which reduces the discrimination between descriptors and leads to some incorrect matches. To alleviate this problem, we propose a learnable...
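The L2 normalization referred to above can be sketched as follows. This shows only the standard projection of descriptors onto the unit hypersphere, not the learnable alternative the abstract goes on to propose; the function name and the `eps` guard are assumptions made for this example.

```python
import numpy as np

def l2_normalize(descriptors, eps=1e-12):
    """Scale each row (one descriptor) to unit Euclidean length."""
    descriptors = np.asarray(descriptors, dtype=float)
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    # `eps` avoids division by zero for an all-zero descriptor.
    return descriptors / np.maximum(norms, eps)
```

After this step every descriptor has norm 1, so the squared Euclidean distance between two descriptors a and b equals 2 - 2(a · b), which is why all matching then depends on angles alone.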
Limited by the locality of convolutional neural networks, most existing local feature description methods only learn descriptors with local information and lack awareness of the global and surrounding spatial context. In this work, we focus on making local descriptors "look wider to describe better" by learning local Descriptors with More Than Local information (MTLDesc). Specifically, we resort to context augmentation and a spatial attention mechanism to make the descriptors obtain non-local awareness. First, an Adaptive Global Context Augmented Module and a Diverse Local Context Augmented Module are proposed to construct...
Recently, CLIP has found practical utility in the domain of pixel-level zero-shot segmentation tasks. The present landscape features two-stage methodologies beset by issues such as intricate pipelines and elevated computational costs. While current one-stage approaches alleviate these concerns and incorporate Visual Prompt Training (VPT) to uphold CLIP's generalization capacity, they still fall short of fully harnessing CLIP's potential for unseen class demarcation and precise pixel predictions. To further...
Gait recognition is a promising biometric method that aims to identify pedestrians from their unique walking patterns. The silhouette modality, renowned for its easy acquisition, simple structure, sparse representation, and convenient modeling, has been widely employed in controlled in-the-lab research. However, as gait recognition rapidly advances to in-the-wild scenarios, various conditions raise significant challenges for the silhouette modality, including 1) unidentifiable low-quality silhouettes (abnormal segmentation,...
Accurate segmentation of colorectal polyps in colonoscopy images is crucial for effective diagnosis and management of colorectal cancer (CRC). However, current deep learning-based methods primarily rely on fusing RGB information across multiple scales, leading to limitations in accurately identifying polyps due to restricted RGB domain information and challenges of feature misalignment during multi-scale aggregation. To address these limitations, we propose the Polyp Segmentation Network with Shunted Transformer (PSTNet), a novel approach...
Surface topographical roughness plays a crucial role in enhancing biological activities by providing biomechanical stability, optimal osseointegration, and torsion resistance. However, the effects of surface roughness on antibacterial and cytotoxicity performance remain a challenge for implant applications. This study investigates the effect of roughening polyethylene terephthalate (PET) using sandpaper prior to applying a SiO-ZnO nanocomposite coating. Results show that the surface roughness increased from approximately 100...