- Advanced Neural Network Applications
- Advanced Image and Video Retrieval Techniques
- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Advanced Vision and Imaging
- Remote-Sensing Image Classification
- Cryptographic Implementations and Security
- Coding theory and cryptography
- Chaos-based Image/Signal Encryption
- Image Retrieval and Classification Techniques
- Visual Attention and Saliency Detection
- Machine Learning and Data Classification
- Advanced Image Processing Techniques
- Video Surveillance and Tracking Methods
- Advanced Image Fusion Techniques
- Neural Networks and Applications
- Image Enhancement Techniques
- Remote Sensing and Land Use
- Robotics and Sensor-Based Localization
- Advanced Multi-Objective Optimization Algorithms
- Advanced Steganography and Watermarking Techniques
- Generative Adversarial Networks and Image Synthesis
- Physical Unclonable Functions (PUFs) and Hardware Security
- Digital Media Forensic Detection
China Mobile (China)
2025
Nanjing University of Science and Technology
2021-2024
PLA Information Engineering University
2007-2024
Zhejiang Science and Technology Information Institute
2011-2024
Fudan University
2021-2024
Shanghai Center for Brain Science and Brain-Inspired Technology
2024
Shanghai Institute for Science of Science
2024
Xinjiang University
2023
Shangqiu Normal University
2010-2023
Wuyi University
2023
Few-shot semantic segmentation is the task of learning to locate each pixel of a novel class in the query image with only a few annotated support images. The current correlation-based methods construct pair-wise feature correlations to establish many-to-many matching because typical prototype-based approaches cannot learn fine-grained correspondence relations. However, existing methods still suffer from the noise contained in naive correlations and the lack of context information in correlations. To alleviate these problems mentioned above,...
Weakly supervised semantic segmentation with only image-level labels aims to reduce annotation costs for the segmentation task. Existing approaches generally leverage class activation maps (CAMs) to locate the object regions for pseudo label generation. However, CAMs can only discover the most discriminative parts of objects, thus leading to inferior pixel-level pseudo labels. To address this issue, we propose a saliency guided Inter-...
Multilabel image classification aims to assign images to multiple possible labels. In this task, each image may be associated with multiple labels, making it more challenging than the single-label classification problems. For instance, convolutional neural networks (CNNs) have not met the performance requirement in utilizing statistical dependencies between labels in this study. Additionally, data imbalance is a common problem in machine learning that needs to be considered for multilabel medical image classification. Furthermore, the concatenation of...
Weakly supervised semantic segmentation (WSSS) models relying on class activation maps (CAMs) have achieved desirable performance compared to the non-CAMs-based counterparts. However, to guarantee that the WSSS task is feasible, we need to generate pseudo labels by expanding the seeds from CAMs, which is complex and time-consuming, thus hindering the design of efficient end-to-end (single-stage) WSSS approaches. To tackle the above dilemma, we resort to the off-the-shelf and readily accessible saliency for directly obtaining given...
One-shot semantic image segmentation aims to segment the object regions for the novel class with only one annotated image. Recent works adopt the episodic training strategy to mimic the expected situation at testing time. However, these existing approaches simulate the test conditions too strictly during the training process, and thus cannot make full use of the given label information. Besides, these approaches mainly focus on the foreground-background target setting. They utilize only binary mask labels for training. In this paper, we propose to leverage...
The image-level label has prevailed in weakly supervised semantic segmentation tasks due to its easy availability. Since image-level labels can only indicate the existence or absence of specific categories of objects, visualization-based techniques have been widely adopted to provide object location clues. Considering that class activation maps (CAMs) can only locate the most discriminative part of objects, recent approaches usually adopt an expansion strategy to enlarge the activation area for more integral object localization. However, without proper...
There has been significant attention devoted to the effectiveness of various domains, such as semi-supervised learning, contrastive learning, and meta-learning, in enhancing the performance of methods for noisy label learning (NLL) tasks. However, most existing methods still depend on prior assumptions regarding clean samples amidst different sources of noise (e.g., a pre-defined drop rate or a small subset of clean samples). In this paper, we propose a simple yet powerful idea called NPN, which revolutionizes Noisy label learning by integrating...
In this paper, investigations are conducted to simplify and refine a vision-model-based video quality metric without compromising its prediction accuracy. Unlike other vision-model-based quality metrics, the proposed metric is parameterized using subjective quality assessment data recently provided by the Video Quality Experts Group. The quality metric is able to generate a perceptual distortion map for each and every video frame. A perceptual blocking distortion metric (PBDM) is introduced which utilizes the simplified quality metric. The PBDM is formulated based on the observation that blocking artifacts are noticeable only in certain...
Unsupervised domain adaptation for semantic segmentation aims to transfer knowledge from a labeled source domain to another unlabeled target domain. However, due to the label noise and domain mismatch, learning directly from source domain data tends to have poor performance. Though adversarial learning methods strive to reduce domain discrepancies by aligning feature distributions, traditional methods suffer from training imbalance and feature distortion problems. Besides, due to the absence of target domain labels, the classifier is blind to target domain features during training. Consequently, the final classifier overfits the source domain features and usually...
The early detection and grading of gliomas is important for treatment decision and assessment of prognosis. Over the last decade, numerous automated computer analysis tools have been proposed, which can potentially lead to more reliable and reproducible brain tumor diagnostic procedures. In this paper, we used the gradient-based features extracted from structural magnetic resonance imaging (sMRI) images to depict the subtle changes within the brains of patients with gliomas. Based on the gradient features, we proposed a novel...
Vehicle detection and vehicle viewpoint estimation are both crucial for assistive and autonomous driving systems. In this paper, we propose a soft discriminative mixture of viewpoint (SDMoV) models for joint vehicle detection and viewpoint estimation. The proposed SDMoV model is learned in two steps. First, a viewpoint-specific component model, which aims to maximize the classification accuracy, is learned for each cluster of vehicle images with similar viewpoint. Second, a new margin objective function is designed to retrain these component models into SDMoV models. The SDMoV model is capable of detecting vehicles...
This paper reports the background and results of Automated Object Recognition in Optical Remote Sensing Imagery, which is one of the tracks of the 2022 International Algorithm Case Competition, and summarizes the challenges, champion solutions, and future directions.
The choice of treatment and prognosis evaluation depend on the accurate early diagnosis of brain tumors. Many brain tumors go undiagnosed or are overlooked by clinicians as a result of the challenges associated with manually evaluating magnetic resonance imaging (MRI) images in clinical practice. In this study, we built a computer-aided diagnosis (CAD) system for glioma detection, grading, segmentation, and knowledge discovery based on artificial intelligence algorithms. Neuroimages are specifically represented using a type of visual...
Few-shot video object segmentation (FSVOS) aims to segment dynamic objects of unseen classes by resorting to a small set of support images that contain pixel-level object annotations. Existing methods have demonstrated that the domain agent-based attention mechanism is effective in FSVOS by learning the correlation between support images and query frames. However, the agent frame contains redundant pixel information and background noise, resulting in inferior segmentation performance. Moreover, existing methods tend to ignore inter-frame correlations in query videos. To...
Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision–language correlation from image–text pairs, like BLIP-2 and GPT-4, have been intensively investigated. However, despite these developments, the application of LLMs and VLMs in image quality assessment (IQA), particularly in medical imaging, remains unexplored....
Image matting is widely studied for accurate foreground extraction. Most algorithms, including deep-learning based solutions, require a carefully edited trimap. Recent works attempt to combine the segmentation stage and the matting stage in one CNN model, but errors occurring at the segmentation stage lead to an unsatisfactory matte. We propose a user-guided approach to practical human matting. More precisely, we provide a good automatic initial matting and a natural way of interaction that reduces the workload of drawing trimaps and allows users to guide the matting in ambiguous...
In vehicle retrieval, the vehicle patch should first be localized to remove irrelevant background information. Moreover, negative samples are much more prevalent than positive samples, and the information from negative samples is not fully exploited in the triple loss. What we need is a way to incorporate global knowledge and structure to address these two issues. Therefore, we introduce a local-global context network for landmark alignment to update the predicted results by using semantic local compatibility, and we propose a structure-aware quadruple loss...
Traditional learning algorithms use only labeled data for training. However, labeled examples are often difficult or time consuming to obtain since they require substantial human labeling efforts. On the other hand, unlabeled data are often relatively easy to collect. Semisupervised learning addresses this problem by using large quantities of unlabeled data with labeled data to build better learning algorithms. In this paper, we use the manifold regularization approach to formulate the semisupervised learning problem, where a regularization framework which balances a tradeoff between loss and penalty is established. We...