- Domain Adaptation and Few-Shot Learning
- Multimodal Machine Learning Applications
- Text and Document Classification Technologies
- Advanced Neural Network Applications
- Face and Expression Recognition
- Topic Modeling
- COVID-19 diagnosis using AI
- Network Security and Intrusion Detection
- Advanced Image and Video Retrieval Techniques
- Machine Learning and Data Classification
- Music and Audio Processing
- Sentiment Analysis and Opinion Mining
- Remote-Sensing Image Classification
- Anomaly Detection Techniques and Applications
- Advanced Malware Detection Techniques
- Air Quality and Health Impacts
- Air Quality Monitoring and Forecasting
- Natural Language Processing Techniques
- Image Retrieval and Classification Techniques
- Metaheuristic Optimization Algorithms Research
- Internet Traffic Analysis and Secure E-voting
- Atmospheric chemistry and aerosols
- Radiomics and Machine Learning in Medical Imaging
- Advanced Text Analysis Techniques
- Neural Networks and Reservoir Computing
Southwest Jiaotong University
2020-2025
Southwestern University of Finance and Economics
2019-2021
University of Electronic Science and Technology of China
2014-2020
We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains. Our approach draws on an insight connecting two existing works: curriculum domain adaptation and self-training. Inspired by the former, PyCDA constructs a pyramid curriculum which contains various properties about the target domain. Those properties are mainly the desired label distributions over the target domain images, image regions, and pixels. By enforcing the network to observe those...
Human multimodal emotion recognition involves time-series data of different modalities, such as natural language, visual motions, and acoustic behaviors. Due to the variable sampling rates for sequences from different modalities, the collected streams are usually unaligned. The asynchrony across modalities increases the difficulty of conducting efficient fusion. Hence, this work mainly focuses on the fusion of unaligned sequences. To this end, we propose a Progressive Modality Reinforcement (PMR) approach based on recent advances in crossmodal...
The support vector machine is a classification model which has been widely used in many nonlinear and high-dimensional pattern recognition problems. However, it is inefficient or impracticable to implement a support vector machine when dealing with a large-scale training set, due to its computational difficulties as well as the complexity. In this paper, we study this problem mainly in the context of reduction methods that reconstruct the training set for the support vector machine. We focus on the fact of the uneven distribution of instances in the space and propose an efficient self-adaption instance...
Trained with the standard cross entropy loss, deep neural networks can achieve great performance on correctly labeled data. However, if the training data is corrupted with label noise, deep models tend to overfit the noisy labels, thereby achieving poor generalization performance. To remedy this issue, several loss functions have been proposed and demonstrated to be robust to label noise. Although most of the robust loss functions stem from the Categorical Cross Entropy (CCE) loss, they fail to embody the intrinsic relationships between CCE and other loss functions. In this paper,...
Exploiting photo-realistic synthetic data to train semantic segmentation models has received increasing attention over the past years. However, the domain mismatch between synthetic and real images will cause a significant performance drop when a model trained with synthetic images is directly applied to real-world scenarios. In this paper, we propose a new domain adaptation approach, called Pivot Interaction Transfer (PIT). Our method mainly focuses on constructing pivot information that is common knowledge shared across domains as...
Videos flow as a mixture of language, acoustic, and vision modalities. A thorough video understanding needs to fuse time-series data of different modalities for prediction. Due to the variable receiving frequency for sequences from each modality, there usually exists inherent asynchrony across the collected multimodal streams. Towards an efficient fusion of asynchronous multimodal streams, we need to model the correlations between elements from different modalities. The recent Multimodal Transformer (MulT) approach extends the self-attention mechanism...
A botnet is one of the most grievous threats to network security since it can evolve into many attacks, such as Denial‐of‐Service (DoS), spam, and phishing. However, current detection methods are inefficient at identifying unknown botnets. The high‐speed network environment makes detection more difficult. To solve these problems, we improve the progress of packet processing technologies such as New Application Programming Interface (NAPI) and zero copy, and propose an efficient quasi‐real‐time intrusion detection system. Our work detects botnets using...
Fine-tuning pre-trained models for downstream tasks is mainstream in deep learning. However, the pre-trained models are limited to being fine-tuned by data from a specific modality. For example, as a visual model, DenseNet cannot directly take textual data as its input. Hence, although large pre-trained models such as DenseNet or BERT have great potential for downstream recognition tasks, they have weaknesses in leveraging multimodal information, which is a new trend of deep learning. This work focuses on fine-tuning pre-trained unimodal models with multimodal inputs of image-text pairs and expanding them for image-text multimodal recognition. To this...
Natural language BERTs are trained with a language corpus in a self-supervised manner. Unlike natural language BERTs, vision language BERTs need paired data to train, which restricts the scale of VL-BERT pretraining. We propose a self-training approach that allows training VL-BERTs from unlabeled image data. The proposed method starts with our unified conditional model, a vision language BERT model that can perform zero-shot conditional generation. Given different conditions, the unified conditional model can generate captions, dense captions, and even questions. We use the labeled image data to train a teacher model and use it to generate pseudo captions on...
Multi-view unsupervised feature selection (MUFS) has been demonstrated as an effective technique to reduce the dimensionality of multi-view unlabeled data. The existing methods assume that all views are complete. However, multi-view data are usually incomplete, i.e., a part of the instances are presented on some views but not all views. Besides, learning the complete similarity graph, an important and promising technology in existing MUFS methods, cannot be achieved due to the missing views. In this paper, we propose a complementary and consensus learning-based...
Concurrent pollution of fine particulate matter (PM2.5) and ozone has been increasingly reported in China recently. Here, we further confirm widespread co‐occurring summertime PM2.5‐ozone extremes in southern China. The annual‐average frequency of co‐occurrence is above 50% from 2015 to 2022, especially in the Pearl River Delta region (72 ± 12%). The spatial extent (city numbers) and temporal persistence (co‐occurrence days) for cities with >50% co‐occurrence frequency increase at a rate of two cities/year and 14 days/year,...
The automatic text categorization technique has gained significant attention among researchers because of the increasing availability of online text information. Therefore, many different learning approaches have been designed in the text categorization field. Among them, a widely used method is the Centroid-Based Classifier (CBC) due to its theoretical simplicity and computational efficiency. However, the classification accuracy of CBC greatly depends on the data distribution. Thus it leads to a misfit model and also poor classification performance when...
Purpose: To train deep learning models to differentiate benign and malignant breast tumors in ultrasound images, we need to collect many training samples with clear labels. In general, biopsy results can be used as benign/malignant labels. However, most clinical samples generally do not have biopsy results. Previous works proposed generating labels according to Breast Imaging Reporting and Data System (BI‐RADS) ratings. However, this approach will cause noisy labels, which means that the benign/malignant labels produced from BI‐RADS diagnoses may...
Colorectal polyp segmentation (CPS), an essential problem in medical image analysis, has garnered growing research attention. Recently, deep learning-based models have completely overwhelmed traditional methods in the field of CPS, and more and more deep CPS methods have emerged, bringing CPS into the deep learning era. To help researchers quickly grasp the main techniques, datasets, evaluation metrics, challenges, and trends of deep learning-based CPS, this paper presents a systematic and comprehensive review of deep-learning-based CPS methods from 2014 to 2023, a total of 115 technical...
Semantic segmentation, which aims to acquire pixel-level understanding about images, is among the key components in computer vision. To train a good segmentation model for real-world images, it usually requires a huge amount of time and labor effort to obtain sufficient pixel-level annotations of real-world images beforehand. To get rid of such a nontrivial burden, one can use simulators to automatically generate synthetic images that inherently contain full pixel-level annotations and use them for training. However, training with synthetic images usually cannot lead to good performance due to the domain difference between...