- Domain Adaptation and Few-Shot Learning
- Medical Image Segmentation Techniques
- Generative Adversarial Networks and Image Synthesis
- Advanced Neural Network Applications
- Radiomics and Machine Learning in Medical Imaging
- Emotion and Mood Recognition
- Face recognition and analysis
- Face and Expression Recognition
- Speech Recognition and Synthesis
- Speech and Audio Processing
- COVID-19 diagnosis using AI
- Medical Imaging Techniques and Applications
- Advanced Image Processing Techniques
- Adversarial Robustness in Machine Learning
- Video Surveillance and Tracking Methods
- EEG and Brain-Computer Interfaces
- Multimodal Machine Learning Applications
- Advanced Neuroimaging Techniques and Applications
- Advanced MRI Techniques and Applications
- Image Processing Techniques and Applications
- AI in cancer detection
- Advanced Vision and Imaging
- Music and Audio Processing
- Voice and Speech Disorders
- Anomaly Detection Techniques and Applications
Yale University
2024-2025
Sanming University
2025
Yale Cancer Center
2024-2025
Hohai University
2017-2024
Harvard University
2019-2024
Guangdong University of Technology
2024
Massachusetts General Hospital
2021-2024
Shenyang Pharmaceutical University
2024
Sichuan University
2010-2024
West China Hospital of Sichuan University
2024
Recent advances in domain adaptation show that deep self-training presents a powerful means for unsupervised domain adaptation. These methods often involve an iterative process of predicting on the target domain and then taking the confident predictions as pseudo-labels for retraining. However, since pseudo-labels can be noisy, self-training can put overconfident label belief on wrong classes, leading to deviated solutions with propagated errors. To address the problem, we propose a confidence regularized self-training (CRST) framework, formulated as regularized self-training. Our...
Consider a study in which 2 groups are followed over time to assess group differences in the average rate of change, rate of acceleration, or higher degree polynomial effect. In designing such a study, one must decide on the duration of the study, the frequency of observation, and the number of participants. The authors consider how these choices affect statistical power and show that power depends on a standardized effect size, the sample size, and a person-specific reliability coefficient. This reliability, in turn, depends on study duration and frequency. These relations enable researchers to weigh...
To investigate the prevalence of and risk factors for poor mental health of Chinese university students during the Corona Virus Disease 2019 (COVID-19) pandemic. Chinese nation-wide on-line cross-sectional survey on university students, collected between February 12th and 17th, 2020. The primary outcome was clinically-relevant posttraumatic stress disorder symptoms. Secondary outcomes included clinically-relevant anxiety and depressive symptoms, while posttraumatic growth was considered as an indicator of effective coping reaction. Of 2,500 invited students, 2,038 completed...
A key challenge of facial expression recognition (FER) is to develop effective representations that balance the complex distribution of intra- and inter-class variations. The latest deep convolutional networks proposed for FER are trained by penalizing the misclassification of images via the softmax loss. In this paper, we show that better FER performance can be achieved by combining the deep metric loss and the softmax loss in a unified framework with two fully connected layer branches via joint optimization. A generalized adaptive (N+M)-tuplet clusters...
The RNN-Transducer (RNNT) outperforms classic Automatic Speech Recognition (ASR) systems when a large amount of supervised training data is available. For low-resource languages, the RNNT models overfit and cannot directly take advantage of additional large text corpora as in classic ASR systems. We focus on the prediction network of the RNNT, since it is believed to be analogous to the Language Model (LM) in the classic ASR systems. We pre-train the prediction network with text-only data, which is not helpful. Moreover, removing the recurrent layers from the prediction network, which makes...
In this work, we propose an adversarial unsupervised domain adaptation (UDA) method under the inherent conditional and label shifts, in which we aim to align the distributions w.r.t. both p(x|y) and p(y). Since the labels are inaccessible in the target domain, conventional adversarial UDA methods assume that p(y) is invariant across domains and rely on aligning p(x) as an alternative to the p(x|y) alignment. To address this, we provide a thorough theoretical and empirical analysis of the conventional adversarial UDA methods under both conditional and label shifts, and propose a novel and practical alternative optimization scheme for adversarial UDA. Specifically, we infer...
This paper reviews the video colorization challenge on the New Trends in Image Restoration and Enhancement (NTIRE) workshop, held in conjunction with CVPR 2023. The target of this challenge is converting grayscale videos into color videos with better performance and temporal consistency. The challenge consists of two tracks. For Track 1, the goal is achieving the best FID (Fréchet Inception Distance) performance while being constrained to maintain or improve over the baseline method in terms of the temporal-consistency metric. The Color Distribution Consistency (CDC) index is used as...
Effective training of deep neural networks requires much data to avoid underdetermination and poor generalization. Data augmentation alleviates this by using existing data more effectively. However, standard data augmentation produces only limited plausible alternative data by, for example, flipping, distorting, adding noise to, or cropping a patch from the original samples. In this paper, we introduce the adversarial autoencoder (AAE) to impose a uniform distribution on the feature representations and apply linear interpolation on...
Common spatial pattern (CSP) is an efficient algorithm widely used in feature extraction for EEG-based motor imagery classification. Traditional CSP depends only on spatial filtering, which aims to maximize or minimize the ratio of variances of filtered EEG signals in different classes. Recent advances in CSP approaches show that temporal filtering is also preferable for extracting discriminative features. In view of this perspective, a novel spatio-temporal filtering strategy is proposed in this paper. To improve computational efficiency and alleviate...
Current literature yields mixed results about the effectiveness of relationship education (RE) with low‐income participants and those who experience a high level of individual or relational distress. Scholars have called for research that examines whether initial levels of distress act as a moderator of RE outcomes. To test whether initial levels of relational and/or individual distress moderate the effectiveness of RE, this study used two samples, one of couples who received couple‐oriented RE with their partner (n = 192 couples) and one of individuals who received individual‐oriented RE by themselves (n = 60 individuals)....
Recent successes of deep learning-based recognition rely on maintaining the content related to the main-task label. However, how to explicitly dispel the noisy signals for better generalization remains an open issue. We systematically summarize the detrimental factors as task-relevant/irrelevant semantic variations and unspecified latent variation. In this paper, we cast these problems as an adversarial minimax game in the latent space. Specifically, we propose equipping an end-to-end conditional adversarial network with the ability to decompose...
Emotion recognition has become an important component of human–computer interaction systems. Research on emotion recognition based on electroencephalogram (EEG) signals is mostly conducted by analyzing the EEG signals of all channels. Although some progress has been achieved, there are still several challenges, such as high dimensions, correlation between different features, and feature redundancy in the realistic experimental process. These challenges have hindered the application of emotion recognition to portable human–computer interaction systems (or devices). This paper explores how...
This paper considers the problem of image set-based face verification and identification. Unlike the traditional single sample (an image or a video) setting, this situation assumes the availability of a set of heterogeneous collections of orderless images and videos. The samples can be taken at different check points, from different identity documents, etc. The importance of each sample is usually considered either equal or based on a quality assessment of that sample independent of the other samples and/or videos in the image set. How to model the relationship of orderless images and videos within a set remains a challenge. We...
There is a large amount of publicly available labeled image-based facial expression recognition datasets. How these images could help audio emotion recognition with limited labeled data, according to their inherent correlations, is a meaningful and challenging task. In this paper, we propose a semi-supervised adversarial network that allows knowledge transfer from labeled videos to the heterogeneous labeled audio domain, hence enhancing the audio emotion recognition performance. Specifically, face image samples are translated to spectrograms class-wisely. To harness in...
Semantic segmentation (SS) is an important perception task for self-driving cars and robotics, which classifies each pixel into a pre-determined class. The widely used cross entropy (CE) loss-based deep networks have achieved significant progress w.r.t. the mean Intersection-over-Union (mIoU). However, the cross entropy loss cannot take the different importance of each class in a self-driving system into account. For example, pedestrians in the image should be much more important than the surrounding buildings when making decisions in driving, so their segmentation results...
Deep neural networks are usually data-starved, but manual annotation can be costly in many specific tasks, for instance, emotion recognition from audio. However, there is a large amount of publicly available labeled image-based facial expression datasets. How these images could help audio emotion recognition with limited labeled data, according to their inherent correlations, is a meaningful and challenging task. In this paper, we propose a semi-supervised adversarial network that allows knowledge transfer from labeled videos...