- Video Surveillance and Tracking Methods
- Human Pose and Action Recognition
- Domain Adaptation and Few-Shot Learning
- Advanced Neural Network Applications
- Spectroscopy and Chemometric Analyses
- Multimodal Machine Learning Applications
- Gait Recognition and Analysis
- Face recognition and analysis
- Advanced Image and Video Retrieval Techniques
- Air Quality Monitoring and Forecasting
- Water Quality Monitoring and Analysis
- Advanced Image Processing Techniques
- Image Retrieval and Classification Techniques
- Air Quality and Health Impacts
- Machine Learning and ELM
- Fire Detection and Safety Systems
- Image Enhancement Techniques
- COVID-19 diagnosis using AI
- Visual perception and processing mechanisms
- Cancer-related molecular mechanisms research
- Advanced Algorithms and Applications
- Advanced Sensor and Control Systems
- Generative Adversarial Networks and Image Synthesis
- Face and Expression Recognition
- Image and Video Stabilization
Xi'an University of Science and Technology
2009-2024
Tianjin University
2024
Shenyang Institute of Computing Technology (China)
2024
University of Chinese Academy of Sciences
2013-2024
Beijing Academy of Artificial Intelligence
2020-2024
Shanghai Artificial Intelligence Laboratory
2024
Institute of Automation
2013-2023
Chinese Academy of Sciences
2013-2023
Center for Excellence in Brain Science and Intelligence Technology
2022
Institute of Applied Ecology
2021
Person Re-identification (ReID) is an important yet challenging task in computer vision. Due to diverse background clutter and variations in viewpoints and body poses, it is far from solved. How to extract discriminative and robust features invariant to background clutter is the core problem. In this paper, we first introduce binary segmentation masks to construct synthetic RGB-Mask pairs as inputs, then we design a mask-guided contrastive attention model (MGCAM) to learn features separately from the body and background regions. Moreover, we propose a novel region-level...
Image and sentence matching has made great progress recently, but it remains challenging due to the large visual-semantic discrepancy. This mainly arises because the pixel-level representation of an image usually lacks the high-level semantic information found in its matched sentence. In this work, we propose a semantic-enhanced image and sentence matching model, which can improve the image representation by learning semantic concepts and then organizing them in a correct semantic order. Given an image, we first use a multi-regional multi-label CNN to predict its semantic concepts, including objects,...
Semantic segmentation has achieved huge progress by adopting deep Fully Convolutional Networks (FCN). However, the performance of FCN-based models relies heavily on large amounts of pixel-level annotations, which are expensive and time-consuming to obtain. To address this problem, it is a good choice to learn to segment with weak supervision from bounding boxes. How to make full use of the class-level and region-level supervision from bounding boxes is the critical challenge for the weakly supervised learning task. In this paper, we first introduce...
Image-level weakly-supervised semantic segmentation (WSSS) aims at learning semantic segmentation by adopting only image class labels. Existing approaches generally rely on class activation maps (CAM) to generate pseudo-masks and then train segmentation models. The main difficulty is that the CAM estimate only covers partial foreground objects. In this paper, we argue that the critical factor preventing the CAM from obtaining the full object mask is the classification boundary mismatch problem in applying the CAM to WSSS. Because the CAM is optimized by the classification task, it focuses on the discrimination across...
Weakly supervised semantic segmentation with only image-level labels saves considerable human effort otherwise spent annotating pixel-level labels. Cutting-edge approaches rely on various innovative constraints and heuristic rules to generate the masks for every single image. Although great progress has been achieved by these methods, they treat each image independently and do not take account of the relationships across different images. In this paper, however, we argue that the cross-image relationship is vital for weakly...
Existing works have designed end-to-end frameworks based on Faster-RCNN for person search. Due to the large receptive fields in deep networks, the feature maps of each proposal, cropped from the stem feature maps, involve redundant context information outside the bounding boxes. However, person search is a fine-grained task which needs accurate appearance information. Such context information can make the model fail to focus on persons, so the learned representations lack the capacity to discriminate various identities. To address this issue, we propose...
Person detection networks have been widely used in person search. These detectors discriminate persons from the background and generate proposals of all the persons from a gallery of scene images for each query. However, such a large number of proposals has a negative influence on the following identity matching process because many distractors are involved. In this paper, we propose a new detection network for person search, named Instance Guided Proposal Network (IGPN), which can learn the similarity between query persons and proposals. Thus, we can decrease the number of proposals according to the similarity scores....
Gait recognition plays a special role in visual surveillance due to its unique advantages, e.g., long-distance, cross-view and non-cooperative recognition. However, it has not yet been widely applied. One reason for this awkwardness is the lack of a truly big dataset captured in practical outdoor scenarios. Here, “big” at least means: (1) a huge amount of gait videos, (2) sufficient subjects, (3) rich attributes, and (4) spatial and temporal variations. Moreover, most existing large-scale...
For unsupervised problems like clustering, linear or non-linear data transformations are widely used techniques. Generally, they are beneficial to data representation. However, if the data have a complicated structure, these techniques would be unsatisfy
Domain adaptation aims to alleviate the distribution discrepancy between source and target domains. Most conventional methods focus on the single-target-domain setting, adapting from one or multiple source domains, while neglecting the multi-target-domain setting. We argue that different target domains also have complementary information, which is very important for performance improvement. In this paper, we propose an Attention-guided Multiple source-and-target Domain Adaptation (AMDA) method to capture the context dependency information on transferable...
Semantic segmentation has achieved huge progress by adopting deep Fully Convolutional Networks (FCN). However, the performance of FCN-based models relies heavily on large amounts of pixel-level annotations, which are expensive and time-consuming to obtain. Considering that bounding boxes also contain abundant semantic and objective information, an intuitive solution is to learn the segmentation with weak supervision from bounding boxes. How to make full use of the class-level and region-level supervision from bounding boxes to estimate the uncertain regions is the critical challenge for the weakly...
Environmental air quality affects people's lives and has a profound guiding significance for the development of social activities. At present, environmental air quality measurement mainly relies on setting detectors at specific monitoring points in cities, with fixed-time sampling and slow analysis, which is severely restricted by time and location. To address this problem, recognizing air quality with mobile cameras is a natural idea. Some air quality measurement algorithms related to deep learning mostly adopt a single convolutional neural network...
Thanks to the advent of deep neural networks, recent years have witnessed rapid progress in person re-identification (re-ID). Deep-learning-based methods dominate the leaderboards of large-scale benchmarks, and some of them even surpass human-level performance. Despite their impressive performance under the single-domain setup, current fully-supervised re-ID models degrade significantly when transplanted to an unseen domain. According to the characteristics of the re-ID task, such degradation is mainly attributed to the dramatic variation...
Brain decoding aims to reconstruct the visual perception of a human subject from fMRI signals, which is crucial for understanding the brain's perception mechanisms. Existing methods are confined to the single-subject paradigm due to substantial brain variability, which leads to weak generalization across individuals and incurs high training costs, exacerbated by the limited availability of fMRI data. To address these challenges, we propose MindAligner, an explicit functional alignment framework for cross-subject brain decoding. The proposed MindAligner...
Elucidating the functional mechanisms of the primary visual cortex (V1) remains a fundamental challenge in systems neuroscience. Current computational models face two critical limitations, namely the challenge of cross-modal integration between partial neural recordings and complex visual stimuli, and the inherent variability in neural characteristics across individuals, including differences in neuron populations and firing patterns. To address these challenges, we present a multi-modal identifiable variational autoencoder (miVAE) that employs...
Roundworm parasite infections are a major cause of human and livestock disease worldwide and a threat to global food security. Disease control currently relies on anthelmintic drugs, to which roundworms are becoming increasingly resistant. An alternative approach is control by vaccination, and 'hidden antigens', components of the worm gut not encountered by the infected host, have been exploited to produce Barbervax, the first commercial vaccine for a gut-dwelling nematode of any host. Here we present the structure of H-gal-GP, a hidden antigen from...