- Advanced Image and Video Retrieval Techniques
- Visual Attention and Saliency Detection
- Advanced Neural Network Applications
- Advanced Vision and Imaging
- Image Processing Techniques and Applications
- Robotics and Sensor-Based Localization
- Video Surveillance and Tracking Methods
- Advanced Image Processing Techniques
- Industrial Vision Systems and Defect Detection
- Face recognition and analysis
- Image Enhancement Techniques
- Recommender Systems and Techniques
- Advanced Image Fusion Techniques
- Generative Adversarial Networks and Image Synthesis
- Image and Video Quality Assessment
- Advanced Graph Neural Networks
- Human Pose and Action Recognition
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Advanced Data Compression Techniques
- Image Retrieval and Classification Techniques
- Video Analysis and Summarization
- Infrared Target Detection Methodologies
- Remote-Sensing Image Classification
- Optical measurement and interference techniques
Hangzhou Dianzi University
2018-2025
Lishui University
2023-2025
Beihang University
2024
Shanghai Tenth People's Hospital
2021-2024
Tongji University
2021-2024
Shandong University
2024
Different from general face recognition, age-invariant face recognition (AIFR) aims at matching faces with a big age gap. Previous discriminative methods usually focus on decomposing facial features into age-related and age-invariant components, which suffer from the loss of identity information. In this article, we propose a novel Multi-feature Fusion and Decomposition (MFD) framework for age-invariant face recognition, which learns more robust age-invariant features and reduces intra-class variants. Specifically, we first sample multiple face images of different ages with the same identity as a face time...
Cross-view geo-localization is to spot images of the same geographic target from different platforms, e.g., drone-view cameras and satellites. It is challenging because of the large visual appearance changes caused by extreme viewpoint variations. Existing methods usually concentrate on mining the fine-grained feature of the geographic target in the image center, but underestimate the contextual information in neighbor areas. In this work, we argue that neighbor areas can be leveraged as auxiliary information, enriching discriminative clues for...
Recently, more and more researchers have paid attention to the surface defect detection of strip steel. However, existing methods usually fail to detect the defect regions in some complex scenes, especially scenes with noise disturbance and diverse defect types. Therefore, this article proposes an end-to-end dense attention-guided cascaded network (DACNet) to detect salient objects (i.e., defects) on the strip steel surface, where the proposed DACNet is a U-shape network including an encoder and a decoder. The encoder first deploys multiresolution convolutional...
RGB-D saliency detection has received more and more attention in recent years. Many efforts have been devoted to this area, and most of them try to integrate the multi-modal information, i.e., RGB images and depth maps, via various fusion strategies. However, some of them ignore the inherent difference between the two modalities, which leads to performance degradation when handling some challenging scenes. Therefore, in this paper, we propose a novel RGB-D saliency model, namely Dynamic Selective Network (DSNet), to perform salient object...
Salient object detection of surface defects is one of the defect detection tasks, which aims at highlighting the defect regions from strip steel, magnetic tile, road, and so on. However, the performance of existing methods degrades dramatically when dealing with complex scenarios, such as low contrast and various defect shapes. Therefore, in this article, we propose a novel saliency model, namely, localizing, focus, and refinement network (LFRNet), which consists of the semantic-guided localizing module, the context-driven focus module, and the edge-aware refinement (ER) module....
Depth images and thermal images contain the spatial geometry information and surface temperature information, which can act as complementary information for the RGB modality. However, the quality of depth images is often unreliable in some challenging scenarios, which will result in the performance degradation of two-modal based salient object detection (SOD). Meanwhile, some researchers pay attention to the triple-modal SOD task, namely visible-depth-thermal (VDT) SOD, where they attempt to explore the complementarity of the RGB image, the depth image, and the thermal image. However, existing methods fail...
Image demoireing is a multi-faceted image restoration task involving both moire pattern removal and color restoration. In this paper, we raise a general degradation model to describe an image contaminated by moire patterns, and propose a novel multi-scale bandpass convolutional neural network (MBCNN) for single image demoireing. For moire pattern removal, we propose multi-block-size learnable bandpass filters (M-LBFs), based on a block-wise frequency domain transform, to learn the priors of moire patterns. We also introduce a new loss function named Dilated...
Existing studies for gait recognition are dominated by in-the-lab scenarios. Since people live in real-world scenes, gait recognition in the wild is a more practical problem that has recently attracted the attention of the multimedia and computer vision community. Current methods that obtain state-of-the-art performance on in-the-lab benchmarks achieve much worse accuracy on the recently proposed in-the-wild datasets because these methods can hardly model the varied temporal dynamics of gait sequences in unconstrained scenes. Therefore, this paper presents a novel multi-hop...
We investigate the spectral and energy efficiencies of the uplink in an integrated satellite-terrestrial cell-free massive multiple-input multiple-output (IST-CF-mMIMO) system assisted by rate-splitting multiple access (RSMA). In the IST-CF-mMIMO system, terrestrial users employ RSMA to transmit a message as a superposition of two parts with different power to access points and a low-Earth-orbit satellite. Taking realistic conditions such as spatially correlated Ricean fading channels, imperfect channel knowledge,...
This article discusses the limitations of single- and two-modal salient object detection (SOD) methods and the emergence of multi-modal SOD techniques that integrate Visible, Depth, or Thermal information. However, current multi-modal methods often rely on simple fusion, such as addition, multiplication, and concatenation, to combine the different modalities, which is ineffective for challenging scenes, such as low illumination and messy backgrounds. To address this issue, we propose a novel multi-modal feature fusion network (MFFNet) for V-D-T salient object detection, where two...
Cross-view geo-localization aims to match images of the same target from different platforms, e.g., drone and satellite. It is a challenging task due to the changing appearance of targets and environmental content from different views. Most methods focus on obtaining more comprehensive information through feature map segmentation, while inevitably destroying the image structure, and are sensitive to the shifting and scale of the target in the query. To address the above issues, we introduce a simple yet effective part-based representation learning, shifting-dense...
Salient object detection (SOD) can be applied to the consumer electronics area, where it can help identify and locate objects of interest. RGB/RGB-D (depth) salient object detection has achieved great progress in recent years. However, there is large room for improvement in exploring the complementarity of two-modal information for RGB-T (thermal) SOD. Therefore, this paper proposes a Transformer-based Cross-modal Integration Network (i.e., TCINet) to detect salient objects in RGB-T images, which can properly fuse two-modal features and interactively aggregate two-level features....
The incorporation of automatic segmentation methodologies into dental X-ray images refined the paradigms of clinical diagnostics and therapeutic planning by facilitating meticulous, pixel-level articulation of both dental structures and proximate tissues. This underpins the pillars of early pathological detection and meticulous disease progression monitoring. Nonetheless, conventional segmentation frameworks often encounter significant setbacks attributable to the intrinsic limitations of X-ray imaging, including compromised image fidelity,...
Graph clustering is a fundamental task in data analysis and has attracted considerable attention in recommendation systems, mapping knowledge domain, and biological science. Because graph convolution is very effective in combining the feature information and topology information of graph data, some graph clustering methods based on graph convolution have achieved superior performance. However, current methods lack consideration of structured information and the process of graph convolution. Specifically, most existing methods ignore the implicit interaction between topology information and feature information, and the stacking of a small number...
Optical coherence tomography angiography (OCTA) offers critical insights into the retinal vascular system, yet its full potential is hindered by challenges in precise image segmentation. Current methodologies struggle with imaging artifacts and clarity issues, particularly under low-light conditions and when using various high-speed CMOS sensors. These challenges are particularly pronounced when diagnosing and classifying diseases such as branch vein occlusion (BVO). To address these issues, we have developed a novel network based on...
Cross-view geo-localization aims to match images of the same target from different platforms, e.g., drone and satellite. It is a challenging task due to the changing appearance of targets and environmental content from different views. Most methods focus on obtaining more comprehensive information through feature map segmentation, while inevitably destroying the image structure, and are sensitive to the shifting and scale of the target in the query. To address the above issues, we introduce a simple yet effective part-based representation learning,...
Change captioning is an emerging task to describe the changes between a pair of images. The difficulty in this task is to discover the differences between the two images. Recently, some methods have been proposed to address this problem. However, they all employ unidirectional difference localization to identify the changes. This can lead to ambiguity about the nature of the changes. Instead, we propose a framework with bidirectional difference localization and semantic consistency reasoning to describe the image changes. First, we locate the changes in the two images by capturing the bidirectional differences. Then we design a decoder with spatial-channel attention...
This paper focuses on the real-world automatic makeup problem. Given one non-makeup target image and one reference image, automatic makeup is to generate one face image which maintains the original identity with the makeup style in the reference image. In the real-world scenario, the face makeup task demands a system that is robust against environmental variants. The two main challenges in real-world face makeup could be summarized as follows: first, the background in real-world images is complicated, and the previous methods are prone to change the style of the background as well; second, the foreground faces are also easily affected. For instance, the "heavy" makeup may lose the discriminative...