- Speech and Audio Processing
- Speech Recognition and Synthesis
- Music and Audio Processing
- Blind Source Separation Techniques
- Artificial Intelligence in Healthcare
- Digital Mental Health Interventions
- Mental Health via Writing
- Hearing Loss and Rehabilitation
- Misinformation and Its Impacts
- Machine Learning in Healthcare
- Vaccine Coverage and Hesitancy
- Health Literacy and Information Accessibility
- Mental Health Research Topics
- Opinion Dynamics and Social Influence
- Meteorological Phenomena and Simulations
- Electronic Health Records Systems
- Complex Network Analysis Techniques
- Artificial Intelligence in Healthcare and Education
- COVID-19 epidemiological studies
- Sentiment Analysis and Opinion Mining
- Flood Risk Assessment and Management
- Underwater Acoustics Research
- Influenza Virus Research Studies
- Advanced Text Analysis Techniques
- Precipitation Measurement and Analysis
Johns Hopkins University
2022-2025
Tsinghua University
2021-2023
Institute of Atmospheric Physics
2023
Chinese Academy of Sciences
2023
University of Chinese Academy of Sciences
2023
Kuaishou (China)
2022
Abstract Objective: To develop and apply a natural language processing (NLP)-based approach to analyze public sentiments on social media and their geographic pattern in the United States toward coronavirus disease 2019 (COVID-19) vaccination. We also aim to provide insights to facilitate the understanding of public attitudes and concerns regarding COVID-19 vaccination. Methods: We collected Tweet posts by residents in the United States after the dissemination of the COVID-19 vaccine. We performed sentiment analysis based on Bidirectional Encoder Representations from Transformers...
ABSTRACT Objective: This scoping review aims to identify and understand the role of artificial intelligence (AI) in the application of integrated electronic health records (EHRs) and patient-generated health data (PGHD) in health care, including clinical decision support, health care quality, and patient safety. We focused on data that combined PGHD and EHR data, and we investigated the role of AI in health care. Methods: We used the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to search articles in six databases: PubMed, Embase, Web of Science,...
The emerging health technologies and digital services provide effective ways of collecting health information and gathering patient-generated health data (PGHD), which provide a more holistic view of a patient's health and quality of life over time, increase visibility into a patient's adherence to a treatment plan or study protocol, and enable timely intervention before a costly care episode. Through a national cross-sectional survey in the United States, we aimed to describe and compare the characteristics of populations with and without mental health issues (depression or anxiety...
This study aims to propose a novel approach for enhancing clinical prediction models by combining structured and unstructured data with multimodal data fusion. We presented a comprehensive framework that integrated multimodal data sources, including textual notes, electronic health records (EHRs), and relevant data from National Electronic Injury Surveillance System (NEISS) datasets. We proposed a hybrid fusion method, which incorporated a state-of-the-art pre-trained language model, to integrate text, EHR, and other data, thereby capturing...
Auditory Attention Decoding (AAD) algorithms play a crucial role in isolating desired sound sources within challenging acoustic environments directly from brain activity. Although recent research has shown promise in AAD using shallow representations such as the auditory envelope and spectrogram, there has been limited exploration of deep Self-Supervised (SS) representations on a larger scale. In this study, we undertake a comprehensive investigation into the performance of linear decoders across 12 deep and 2 shallow representations,...
Latent diffusion models have shown promising results in text-to-audio (T2A) generation tasks, yet previous models have encountered difficulties in generation quality, computational cost, diffusion sampling, and data preparation. In this paper, we introduce EzAudio, a transformer-based T2A diffusion model, to handle these challenges. Our approach includes several key innovations: (1) We build the T2A model on the latent space of a 1D waveform Variational Autoencoder (VAE), avoiding the complexities of handling 2D spectrogram representations and using an...
Precipitation nowcasting is a crucial element in current weather service systems. Data-driven methods have proven highly advantageous, due to their flexibility in utilizing detailed initial hydrometeor observations and their capability to approximate meteorological dynamics effectively given sufficient training data. However, data-driven methods often encounter severe approximation/optimization errors, rendering their predictions and associated uncertainty estimates unreliable. Here we develop a probabilistic diffusion...
Social network data often contain missing values because of the sensitive nature of the information collected and the dependency among network actors. As a response, network imputation methods, including simple ones constructed from network structural characteristics and more complicated model-based ones, have been developed. Although past studies have explored the influence of missing data on social networks and the effectiveness of imputation procedures in many missing data conditions, the current study aims to evaluate a more extensive set of eight network imputation techniques (i.e., null-tie, Reconstruction, Preferential...
Common target sound extraction (TSE) approaches primarily relied on discriminative approaches in order to separate the target sound while minimizing interference from the unwanted sources, with varying success in separating the target from the background. This study introduces DPM-TSE, a generative method based on diffusion probabilistic modeling (DPM) for Target Sound Extraction (TSE), to achieve both cleaner target renderings as well as improved separability from unwanted sounds. The technique also tackles the noise floor of DPM by introducing correction schedules and...
Speech separation, the task of isolating multiple speech sources from a mixed audio signal, remains challenging in noisy environments. In this paper, we propose a generative correction method to enhance the output of a discriminative separator. By leveraging a generative corrector based on a diffusion model, we refine the separation process for single-channel mixture speech by removing noises and perceptually unnatural distortions. Furthermore, we optimize the generative model using a predictive loss to streamline the diffusion model's reverse process into a single step...
Generative voice technologies are rapidly evolving, offering opportunities for more personalized and inclusive experiences. Traditional one-shot voice conversion (VC) requires a target recording during inference, limiting ease of usage in generating desired voice timbres. Text-guided generation offers an intuitive solution to convert voices to desired "DreamVoices" according to the users' needs. Our paper presents two major contributions to VC technology: (1) DreamVoiceDB, a robust dataset of voice timbre annotations for 900 speakers...
In this paper, we introduce SoloAudio, a novel diffusion-based generative model for target sound extraction (TSE). Our approach trains latent diffusion models on audio, replacing the previous U-Net backbone with a skip-connected Transformer that operates on latent features. SoloAudio supports both audio-oriented and language-oriented TSE by utilizing a CLAP model as the feature extractor for target sounds. Furthermore, SoloAudio leverages synthetic audio generated by state-of-the-art text-to-audio models for training, demonstrating strong...
In this paper, we introduce SSR-Speech, a neural codec autoregressive model designed for stable, safe, and robust zero-shot text-based speech editing and text-to-speech synthesis. SSR-Speech is built on a Transformer decoder and incorporates classifier-free guidance to enhance the stability of the generation process. A watermark Encodec is proposed to embed frame-level watermarks into the edited regions of the speech so that which parts were edited can be detected. In addition, the waveform reconstruction leverages the original unedited speech segments,...
ABSTRACT Objective: To develop and apply a natural language processing (NLP)-based approach to analyze public sentiments on social media and their geographic pattern in the United States toward COVID-19 vaccination. We also provide insights to facilitate the understanding of public attitudes and concerns regarding COVID-19 vaccination. Methods: We collected Tweet posts by residents in the United States after the official dissemination of the COVID-19 vaccine. We performed sentiment analysis based on Bidirectional Encoder Representations from Transformers (BERT) and qualitative content...
ABSTRACT Objective: To describe and compare the characteristics of the population with and without mental health issues (depression or anxiety disorder), including physical health, sleep, and alcohol use. We also examined the patterns of social networking service use, patient-generated health data on digital platforms, and health information sharing attitudes and activities. Methods: We drew data from the National Cancer Institute’s 2019 Health Information National Trends Survey (HINTS). Participants were divided into two groups by mental health status. Then, we...
BACKGROUND: The emerging health technologies and digital services provide effective ways of collecting health information and gathering patient-generated health data (PGHD), which provide a more holistic view of a patient’s health and quality of life over time, increase visibility into a patient’s adherence to a treatment plan or study protocol, and enable timely intervention before a costly care episode. OBJECTIVE: Through a national cross-sectional survey in the United States, we aimed to describe and compare...
Pitch correction is the process of adjusting the original pitch of a recording or live performance in order to fit it to a specific key or match a target pitch profile. Pitch correction systems typically consist of several stages: pitch estimation, pitch curve modification, and resynthesis of the audio with the modified pitch curve. Unfortunately, pitch correction often leads to significant artifacts that degrade the overall quality of the modified audio, rendering it unnatural and unpleasant. In this work, we introduce Diff-Pitcher...
Common target sound extraction (TSE) approaches primarily relied on discriminative approaches in order to separate the target sound while minimizing interference from the unwanted sources, with varying success in separating the target from the background. This study introduces DPM-TSE, a first generative method based on diffusion probabilistic modeling (DPM) for target sound extraction, to achieve both cleaner target renderings as well as improved separability from unwanted sounds. The technique also tackles common background noise issues with DPM by introducing correction schedules and...
Sentiment analysis has traditionally leveraged information from text data. More recently, it has become increasingly clear that multimodal data provides a rich space to drastically boost the interpretation of human sentiments by harnessing information across multiple modalities. In this study, we incorporate pre-trained feature extractors and propose a multitask training strategy to improve modality representations for Multimodal Sentiment Analysis (MSA). The experimental results on the CH-SIMS v2 dataset demonstrate superior...