- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Human Pose and Action Recognition
- Advanced Image and Video Retrieval Techniques
- Advanced Neural Network Applications
- Retinal Imaging and Analysis
- Anomaly Detection Techniques and Applications
- Glaucoma and retinal disorders
- Advanced Vision and Imaging
- Retinal Diseases and Treatments
- Image Retrieval and Classification Techniques
- Video Analysis and Summarization
- COVID-19 diagnosis using AI
- Video Surveillance and Tracking Methods
- Generative Adversarial Networks and Image Synthesis
- Recommender Systems and Techniques
- Topic Modeling
- Advanced Image Processing Techniques
- Digital Imaging for Blood Diseases
- Text and Document Classification Technologies
- AI in cancer detection
- Optical Coherence Tomography Applications
- Machine Learning and ELM
- Cancer-related molecular mechanisms research
- Gait Recognition and Analysis
University of Electronic Science and Technology of China
2017-2025
Shenzhen Institutes of Advanced Technology
2023-2024
Institute for Advanced Study
2022-2023
Chinese Academy of Medical Sciences & Peking Union Medical College
2021-2023
Sichuan Academy of Medical Sciences & Sichuan Provincial People's Hospital
2021-2023
Guangzhou University of Chinese Medicine
2022
Ministry of Education of the People's Republic of China
2022
Beijing Institute of Technology
2018
Amazon (United States)
2015-2016
Seattle University
2016
Cross-domain learning methods have shown promising results by leveraging labeled patterns from the auxiliary domain to learn a robust classifier for the target domain, which has only a limited number of labeled samples. To cope with the considerable change between the feature distributions of different domains, we propose a new cross-domain kernel learning framework into which many existing kernel methods can be readily incorporated. Our framework, referred to as Domain Transfer Multiple Kernel Learning (DTMKL), simultaneously learns a kernel function and a robust classifier by minimizing...
In this paper, we study the heterogeneous domain adaptation (HDA) problem, in which the data from the source domain and the target domain are represented by features with different dimensions. By introducing two projection matrices, we first transform the data from the two domains into a common subspace such that the similarity between samples across different domains can be measured. We then propose a new feature mapping function for each domain, which augments the transformed samples with their original features and zeros. Existing supervised learning methods (e.g., SVM and SVR) can be readily employed...
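The augmented feature mapping described above can be illustrated with a small sketch. It assumes hypothetical projection matrices `P` and `Q` that are already given (the abstract does not say how they are learned), and the ordering of the blocks in the augmented vector is an assumption chosen for illustration.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def augment_source(P, x_s, d_t):
    """Map a source sample to [P x_s ; x_s ; zeros(d_t)] (assumed block order)."""
    return matvec(P, x_s) + list(x_s) + [0.0] * d_t

def augment_target(Q, x_t, d_s):
    """Map a target sample to [Q x_t ; zeros(d_s) ; x_t] (assumed block order)."""
    return matvec(Q, x_t) + [0.0] * d_s + list(x_t)

# Hypothetical toy data: 3-D source features, 2-D target features,
# 2-D common subspace.
P = [[1, 0, 0], [0, 1, 0]]   # projects source features into the subspace
Q = [[0, 1], [1, 0]]         # projects target features into the subspace
phi_s = augment_source(P, [1, 2, 3], d_t=2)
phi_t = augment_target(Q, [4, 5], d_s=3)
```

Both augmented vectors have the same length (subspace dimension plus both original dimensions), so a standard classifier such as an SVM can be trained on samples from either domain.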
We propose a visual event recognition framework for consumer videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). Observing that consumer videos generally contain intraclass variations within the same type of events, we first propose a new method, called Aligned Space-Time Pyramid Matching (ASTPM), to measure the distance between any two video clips. Second, we propose a new transfer learning method, referred to as Adaptive Multiple Kernel Learning (A-MKL), in order to 1) fuse the information from multiple pyramid levels and features...
In this paper, we propose a new framework called domain adaptation machine (DAM) for the multiple source domain adaptation problem. Under this framework, we learn a robust decision function (referred to as the target classifier) for label prediction of instances from the target domain by leveraging a set of base classifiers which are prelearned by using labeled instances either from the source domains or from the source domains and the target domain. With the base classifiers, we propose a new domain-dependent regularizer based on the smoothness assumption, which enforces that the target classifier shares similar decision values with the relevant base classifiers on the unlabeled instances from the target domain. This...
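One way to picture the smoothness-based regularizer described above is as a weighted squared difference between the target classifier's decision values and those of each base classifier on unlabeled target instances. The sketch below is only an illustration under that assumption; the function name, the relevance weights, and the factor of 0.5 are hypothetical and are not taken from the abstract.

```python
def smoothness_regularizer(target_scores, base_scores, relevance):
    """Penalize disagreement between the target classifier and base classifiers.

    target_scores: decision values of the target classifier on unlabeled instances
    base_scores:   one list of decision values per base classifier (same instances)
    relevance:     assumed non-negative weight per base classifier
    """
    total = 0.0
    for gamma, scores in zip(relevance, base_scores):
        total += gamma * sum((t - s) ** 2 for t, s in zip(target_scores, scores))
    return 0.5 * total

# Hypothetical toy values: the first base classifier agrees with the target
# classifier, the second does not and has a lower relevance weight.
penalty = smoothness_regularizer(
    target_scores=[1.0, -1.0],
    base_scores=[[1.0, -1.0], [0.0, 0.0]],
    relevance=[0.8, 0.2],
)
```

A base classifier with a larger relevance weight pulls the target classifier's decision values more strongly toward its own.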
We propose a multiple source domain adaptation method, referred to as Domain Adaptation Machine (DAM), to learn a robust decision function (referred to as the target classifier) for label prediction of patterns from the target domain by leveraging a set of pre-computed classifiers (referred to as auxiliary/source classifiers) independently learned with the labeled patterns from multiple source domains. We introduce a new data-dependent regularizer based on the smoothness assumption into Least-Squares SVM (LS-SVM), which enforces that the target classifier shares similar decision values with the auxiliary...
Cross-domain learning methods have shown promising results by leveraging labeled patterns from auxiliary domains to learn a robust classifier for the target domain, which has a limited number of labeled samples. To cope with the tremendous change of feature distribution between different domains in video concept detection, we propose a new cross-domain kernel learning method. Our method, referred to as Domain Transfer SVM (DTSVM), simultaneously learns a kernel function and a robust SVM classifier by minimizing both the structural risk functional and the distribution mismatch between labeled and unlabeled...
Cross-domain object detection is challenging, because the object detection model is often vulnerable to data variance, especially to the considerable domain shift between two distinctive domains. In this paper, we propose a new Unbiased Mean Teacher (UMT) model for cross-domain object detection. We reveal that there often exists a considerable model bias for the simple mean teacher (MT) model in cross-domain scenarios, and eliminate the model bias with several simple yet highly effective strategies. In particular, for the teacher model, we propose a cross-domain distillation method for MT to maximally exploit the expertise of the teacher model. Moreover, for the student model, we alleviate...
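For context, the plain mean teacher (MT) scheme that this abstract builds on keeps the teacher's weights as an exponential moving average of the student's weights. The sketch below shows only that generic update, not the UMT strategies themselves; the dictionary representation of weights and the `decay` value are assumptions for illustration.

```python
def ema_update(teacher, student, decay=0.99):
    """Return teacher weights moved toward the student by an EMA step.

    teacher, student: dicts mapping parameter names to float values
    decay: assumed smoothing factor; closer to 1 means a slower-moving teacher
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }

# Hypothetical single-parameter example.
teacher = {"w": 1.0}
student = {"w": 0.0}
teacher = ema_update(teacher, student, decay=0.9)
```

After each student training step, the teacher is refreshed with this update and then used to produce targets for unlabeled target-domain samples.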
We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains. Our approach draws on an insight connecting two existing works: curriculum domain adaptation and self-training. Inspired by the former, PyCDA constructs a pyramid curriculum which contains various properties about the target domain. Those properties are mainly about the desired label distributions over the target domain images, image regions, and pixels. By enforcing the segmentation neural network to observe those...
This paper presents an anomaly detection method that is based on a sparse coding inspired Deep Neural Network (DNN). Specifically, in light of the success of sparse coding based anomaly detection, we propose Temporally-coherent Sparse Coding (TSC), where a temporally-coherent term is used to preserve the similarity between two similar frames. The optimization of the sparse coefficients in TSC with the Sequential Iterative Soft-Thresholding Algorithm (SIATA) is equivalent to a special stacked Recurrent Neural Network (sRNN) architecture. Further, to reduce the computational cost...
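As background for the soft-thresholding optimization mentioned above, the sketch below runs the standard iterative soft-thresholding algorithm for plain sparse coding, minimizing 0.5 * ||x - D a||^2 + lam * ||a||_1. It deliberately omits the temporally-coherent term and is not the sequential variant from the abstract; the dictionary `D`, the step size, and the toy input are assumptions.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of the L1 norm)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(D, x, lam, step, n_iter=50):
    """Plain ISTA for sparse coding; D is a list of rows, x the signal.

    step should not exceed 1 / (largest eigenvalue of D^T D) for convergence.
    """
    n_rows, n_atoms = len(D), len(D[0])
    a = [0.0] * n_atoms
    for _ in range(n_iter):
        # Residual D a - x, then gradient D^T (D a - x) of the quadratic term.
        residual = [sum(D[i][j] * a[j] for j in range(n_atoms)) - x[i]
                    for i in range(n_rows)]
        grad = [sum(D[i][j] * residual[i] for i in range(n_rows))
                for j in range(n_atoms)]
        # Gradient step followed by soft-thresholding.
        a = [soft_threshold(a[j] - step * grad[j], step * lam)
             for j in range(n_atoms)]
    return a

# Hypothetical toy problem with an identity dictionary.
D = [[1.0, 0.0], [0.0, 1.0]]
codes = ista(D, x=[3.0, 0.5], lam=1.0, step=1.0)
```

Each iteration has the form "linear map, then elementwise nonlinearity", which is why unrolled soft-thresholding iterations are often compared to layers of a recurrent network.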
Computer-aided diagnosis (CAD) in the medical field has received more and more attention in recent years. One important CAD application is to detect and classify breast lesions in ultrasound images. Traditionally, the process of CAD for breast lesion classification is mainly composed of two separate steps: i) locate the lesion region of interest (ROI); ii) classify the located region of interest (ROI) to see if it is benign or not. However, due to the complex structure of the breast and the existence of noise in the ultrasound images, traditional handcrafted feature based methods usually cannot achieve satisfactory...
Human multimodal emotion recognition involves time-series data of different modalities, such as natural language, visual motions, and acoustic behaviors. Due to the variable sampling rates for sequences from different modalities, the collected multimodal streams are usually unaligned. The asynchrony across modalities increases the difficulty of conducting efficient multimodal fusion. Hence, this work mainly focuses on multimodal fusion from unaligned multimodal sequences. To this end, we propose the Progressive Modality Reinforcement (PMR) approach based on the recent advances of crossmodal...
We first propose a new spatio-temporal context distribution feature of interest points for human action recognition. Each action video is expressed as a set of relative XYT coordinates between pairwise interest points in a local region. We learn a global GMM (referred to as the Universal Background Model, UBM) using the relative coordinate features from all the training videos, and then represent each video as the normalized parameters of a video-specific GMM adapted from the global GMM. In order to capture the spatio-temporal relationships at different levels, multiple GMMs are utilized to describe...
Recent work has demonstrated the effectiveness of domain adaptation methods for computer vision applications. In this work, we propose a new multiple source domain adaptation method called Domain Selection Machine (DSM) for event recognition in consumer videos by leveraging a large number of loosely labeled web images from different sources (e.g., Flickr.com and Photosig.com), in which there are no labeled consumer videos. Specifically, we first train a set of SVM classifiers (referred to as source classifiers) using the SIFT features of web images from different source domains. We...
We propose a new learning method for heterogeneous domain adaptation (HDA), in which the data from the source domain and the target domain are represented by heterogeneous features with different dimensions. Using two different projection matrices, we first transform the data from the two domains into a common subspace in order to measure the similarity between the data from the two domains. We then propose two new feature mapping functions to augment the transformed data with their original features and zeros. The existing learning methods (e.g., SVM and SVR) can be readily incorporated with our newly proposed augmented feature representations to effectively...
Objective: The purpose of this paper is to propose a novel algorithm for joint optic disc and cup segmentation, which aids glaucoma detection. Methods: By assuming the shapes of the cup and disc regions to be elliptical, we proposed an end-to-end region-based convolutional neural network for joint optic disc and cup segmentation (referred to as JointRCNN). Atrous convolution is introduced to boost...
Skeleton-based action recognition has been widely investigated considering its strong adaptability to dynamic circumstances and complicated backgrounds. To recognize different actions from skeleton sequences, it is essential to model the posture of the human represented by the skeleton and its changes in the temporal dimension. However, most existing works treat skeleton sequences in the temporal and spatial dimensions in the same way, ignoring the difference between the temporal and spatial dimensions in skeleton data, which is not an optimal way to model skeleton sequences. The posture in each frame is proposed to be modeled...
Existing cross-domain semantic segmentation methods usually focus on the overall segmentation results of whole objects but neglect the importance of object boundaries. In this work, we find that the segmentation performance can be considerably boosted if we treat object boundaries properly. For that, we propose a novel method called BAPA-Net, which is based on a convolutional neural network via Boundary Adaptation and Prototype Alignment, under the unsupervised domain adaptation setting. Specifically, we first construct additional images by pasting...
Video super-resolution has recently become one of the most important mobile-related problems due to the rise of video communication and streaming services. While many solutions have been proposed for this task, the majority of them are too computationally expensive to run on portable devices with limited hardware resources. To address this problem, we introduce the first Mobile AI challenge, where the target is to develop end-to-end deep learning-based video super-resolution solutions that can achieve real-time performance on mobile GPUs. The...
Self-training approaches recently achieved promising results in cross-domain object detection, where people iteratively generate pseudo labels for unlabeled target domain samples with a model, and select high-confidence samples to refine the model. In this work, we reveal that the consistency of classification and localization predictions is crucial to measure the quality of pseudo labels, and propose a new Harmonious Teacher approach to improve the self-training for cross-domain object detection. In particular, we first enhance the quality of pseudo labels by regularizing the consistency of the classification and localization scores when training...
Relevant and irrelevant images collected from the Web (e.g., Flickr.com) have been employed as loosely labeled training data for image categorization and retrieval. In this work, we propose a new approach to learn a robust classifier for text-based image retrieval (TBIR) using relevant and irrelevant training web images, in which we explicitly handle noise in the loose labels of the training images. Specifically, we first partition the relevant and irrelevant training web images into clusters. By treating each cluster as a “bag” and the images in each bag as “instances”, we formulate this task as a multi-instance learning problem with constrained...
We propose a visual event recognition framework for consumer domain videos by leveraging a large amount of loosely labeled web videos (e.g., from YouTube). First, we propose a new aligned space-time pyramid matching method to measure the distances between two video clips, where each video clip is divided into space-time volumes over multiple levels. We calculate the pair-wise distances between any two volumes and further integrate the information from different volumes with Integer-flow Earth Mover's Distance (EMD) to explicitly align the volumes. Second, we propose a new cross-domain learning method in order...