NFDI4DS | UHH-SEMS - Publication Details

Xiaoxue Gao

ORCID: 0000-0003-1920-5228

About

Contact & Profiles

OpenAlex author ID: A5101856962

Research Areas

Speech Recognition and Synthesis
Music and Audio Processing
Speech and Audio Processing
Natural Language Processing Techniques
Topic Modeling
Speech and dialogue systems
Diverse Musicological Studies
Music Technology and Sound Studies
Cancer Cells and Metastasis
Genetic factors in colorectal cancer
Cancer-related molecular mechanisms research
Acupuncture Treatment Research Studies
Bone Metabolism and Diseases
Emotion and Mood Recognition
Adversarial Robustness in Machine Learning
Drug-Induced Hepatotoxicity and Protection
Effects of Radiation Exposure
Sirtuins and Resveratrol in Medicine
TGF-β signaling in diseases
Pelvic floor disorders treatments
Mesenchymal stem cell research
Renal and related cancers
Greenhouse Technology and Climate Control
Music Therapy and Health
COVID-19 diagnosis using AI

Agency for Science, Technology and Research
2024-2025

Institute for Infocomm Research
2024-2025

National University of Singapore
2007-2024

Anhui University of Traditional Chinese Medicine
2024

Shanxi Academy of Medical Sciences
2023

Shanxi Medical University
2023

East China Normal University
2022

Tianjin University of Traditional Chinese Medicine
2021-2022

Beijing Technology and Business University
2020

Fudan University
2019-2020

Patient-Derived Organoids Predict Chemoradiation Responses of Locally Advanced Rectal Cancer

OPENALEX - Publications

Ye Yao, Xiaoya Xu, Lifeng Yang, Ji Zhu, Juefeng Wan and 23 more

10.1016/j.stem.2019.10.010 article EN publisher-specific-oa Cell Stem Cell 2019-11-21

Automatic Lyrics Transcription of Polyphonic Music With Lyrics-Chord Multi-Task Learning

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Lyrics are the words that make up a song, while chords are harmonic sets of multiple notes in music. Lyrics and chords are generally essential information in music, i.e. unaccompanied singing vocals mixed with instrumental music, representing important components of polyphonic music. In the traditional lyrics transcription task, we first extract singing vocals from polyphonic music and then transcribe the lyrics from the resulting singing vocals, where the two steps are optimized independently. In this paper, we propose novel end-to-end network architectures that are designed to disentangle lyrics from chords for effective lyrics transcription in a single...

10.1109/taslp.2022.3190742 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

PAL: Prompting Analytic Learning with Missing Modality for Multi-Modal Class-Incremental Learning

Xianghu Yue, Yiming Chen, Xueyi Zhang, Xiaoxue Gao, Mengling Feng and 3 more

Multi-modal class-incremental learning (MMCIL) seeks to leverage multi-modal data, such as audio-visual and image-text pairs, thereby enabling models to learn continuously across a sequence of tasks while mitigating forgetting. While existing studies primarily focus on the integration and utilization of multi-modal information for MMCIL, a critical challenge remains: the issue of missing modalities during incremental learning phases. This oversight can exacerbate severe forgetting and significantly impair model performance. To bridge...

10.48550/arxiv.2501.09352 preprint EN arXiv (Cornell University) 2025-01-16

TTSlow: Slow Down Text-to-Speech with Efficiency Robustness Evaluations

Xiaoxue Gao, Yiming Chen, Xianghu Yue, Yu Tsao, Nancy F. Chen

10.1109/taslpro.2025.3533357 article EN IEEE Transactions on Audio Speech and Language Processing 2025-01-01

Emo-DPO: Controllable Emotional Speech Synthesis through Direct Preference Optimization

Xiaoxue Gao, Chen Zhang, Yiming Chen, Huayun Zhang, Nancy F. Chen

10.1109/icassp49660.2025.10888737 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues

Kuluhan Binici, Abhinav Ramesh Kashyap, Viktor Schlegel, Andy T. Liu, Vijay Prakash Dwivedi and 4 more

Automatic Speech Recognition (ASR) systems are pivotal in transcribing speech into text, yet the errors they introduce can significantly degrade the performance of downstream tasks like summarization. This issue is particularly pronounced in clinical dialogue summarization, a low-resource domain where supervised data for fine-tuning is scarce, necessitating the use of ASR models as black-box solutions. Employing conventional data augmentation for enhancing the noise robustness of summarization models is not feasible either due to...

10.1609/aaai.v39i22.34518 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2025-04-11

Organoid modelling identifies that DACH1 functions as a tumour promoter in colorectal cancer by modulating BMP signalling

Xiang Hu, Long Zhang, Yaqi Li, Xiaoji Ma, Weixing Dai and 13 more

Background: Dachshund homologue 1 (DACH1) is highly expressed in LGR5+ intestinal stem cells and colorectal tumours. However, the roles of DACH1 in cell stemness and tumorigenesis remain largely undefined. Methods: We used immunohistochemistry, western blotting and quantitative real-time PCR to analyse DACH1 expression in colorectal cancer (CRC) samples. CRISPR/Cas9 gene editing, lentiviral vector-mediated overexpression and shRNA-mediated knockdown were utilized to modulate DACH1 expression in cell lines and organoids. An organoid-based functional model was...

10.1016/j.ebiom.2020.102800 article EN cc-by-nc-nd EBioMedicine 2020-06-01

Online COVID-19 diagnosis with chest CT images: Lesion-attention deep neural networks

Bin Liu, Xiaoxue Gao, Mengshuang He, Fengmao Lv, Guosheng Yin

Chest computed tomography (CT) scanning is one of the most important technologies for COVID-19 diagnosis and disease monitoring, particularly for early detection of coronavirus. Recent advancements in computer vision motivate more concerted efforts in developing AI-driven diagnostic tools to accommodate the enormous demands for COVID-19 diagnostic tests globally. To help alleviate burdens on medical systems, we develop a lesion-attention deep neural network (LA-DNN) to predict COVID-19 positive or negative with a richly annotated...

10.1101/2020.05.11.20097907 preprint EN cc-by-nc medRxiv (Cold Spring Harbor Laboratory) 2020-05-14

NHSS: A speech and singing parallel database

Bidisha Sharma, Xiaoxue Gao, Karthika Vijayan, Xiaohai Tian, Haizhou Li

10.1016/j.specom.2021.07.002 article EN Speech Communication 2021-07-12

Anti-perimenopausal osteoporosis effects of Erzhi formula via regulation of bone resorption through osteoclast differentiation: A network pharmacology-integrated experimental study

Xiaoyan Qin, Zi-chang Niu, Xiao-ling Han, Yun Seok Yang, Wei Qiu and 8 more

10.1016/j.jep.2021.113815 article EN Journal of Ethnopharmacology 2021-01-13

Phytochrome-interacting factors regulate seedling growth through ABA signaling

Shan Liang, Xiaoxue Gao, Yijing Wang, Huilong Zhang, Kexin Yin and 3 more

10.1016/j.bbrc.2020.04.011 article EN Biochemical and Biophysical Research Communications 2020-04-16

PoLyScriber: Integrated Fine-Tuning of Extractor and Lyrics Transcriber for Polyphonic Music

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Lyrics transcription of polyphonic music is challenging as the background music affects lyrics intelligibility. Typically, lyrics transcription can be performed by a two-step pipeline, i.e. a singing vocal extraction front end, followed by a lyrics transcriber back end, where the front end and back end are trained separately. Such a pipeline suffers from both imperfect vocal extraction and mismatch between the front end and back end. In this work, we propose a novel end-to-end integrated fine-tuning framework, that we call PoLyScriber, to globally optimize the vocal extractor front end and lyrics transcriber back end for lyrics transcription in polyphonic music. The experimental results...

10.1109/taslp.2023.3275036 article EN cc-by-nc-nd IEEE/ACM Transactions on Audio Speech and Language Processing 2023-01-01

Token2vec: A Joint Self-Supervised Pre-Training Framework Using Unpaired Speech and Text

Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

Self-supervised pre-training has been successful in both text and speech processing. Speech and text offer different but complementary information. The question is whether we are able to perform a speech-text joint pre-training on unpaired speech and text. In this paper, we take the idea of self-supervised pre-training one step further and propose token2vec, a novel joint pre-training framework for unpaired speech and text based on discrete representations of speech. Specifically, we introduce two modality-specific tokenizers for speech and text. Based on these tokenizers, we convert speech/text sequences into token...

10.1109/icassp49357.2023.10096923 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the singing style vary across music genres, such as pop, metal, and hip hop, which affects lyrics intelligibility of the song in different ways. In this work, we propose to transcribe the lyrics of polyphonic music using a novel genre-conditioned network. The proposed network adopts pre-trained model parameters, and incorporates the genre adapters between layers to capture different genre peculiarities for lyrics-genre pairs, thereby...

10.1109/icassp43922.2022.9747684 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27

Analysis of Speech and Singing Signals for Temporal Alignment

Karthika Vijayan, Xiaoxue Gao, Haizhou Li

Accurate alignment between a singing signal and its spoken lyrics at frame-level is imperative to several applications in singing signal processing. As the acoustic characteristics of speech and singing signals differ significantly, finding the temporal alignment between them is not easy. In this paper, we study speech and singing signals to identify their common properties to facilitate temporal alignment. We observe that: (i) the excitation source in the human voice production mechanism largely varies with speaking and singing and, (ii) for the same linguistic content, speech and singing present very different formant patterns....

10.23919/apsipa.2018.8659615 article EN 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2018-11-01

Transferable Adversarial Attacks against ASR

Xiaoxue Gao, Zexin Li, Yiming Chen, Cong Liu, Haizhou Li

10.1109/lsp.2024.3443711 article IEEE Signal Processing Letters 2024-01-01

Speaker-independent Spectral Mapping for Speech-to-Singing Conversion

Xiaoxue Gao, Xiaohai Tian, Rohan Kumar Das, Yi Zhou, Haizhou Li

Speech-to-Singing (STS) conversion aims at converting one's reading speech into his/her singing vocal. The prior work was mainly focused on transforming the prosody of speech to singing; however, there exist prominent differences between the spectra of speech and singing, which need to be transformed as well. In this paper, we propose to make use of parallel multi-speaker speak-sing data to develop a speaker-independent spectral mapping model, which is conditioned on i-vector to generate the target speaker/singer identity. The model is therefore called...

10.1109/apsipaasc47483.2019.9023056 article EN 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2019-11-01

Bach2 Deficiency Promotes Intestinal Epithelial Regeneration by Accelerating DNA Repair in Intestinal Stem Cells

Yuanchuang Li, Xinxin Rao, Peiyuan Tang, Shengzhi Chen, Qiang Guo and 11 more

Epithelial regeneration is critical for barrier maintenance and organ function after intestinal injury, although the repair mechanisms are unclear. Here, we found that Bach2 deficiency promotes intestinal epithelial cell proliferation during homeostasis. Moreover, genetic inactivation of Bach2 in mouse intestinal epithelium facilitated crypt regeneration after irradiation, resulting in a reduction in mortality. RNA-sequencing analysis of isolated crypts revealed that Bach2 deficiency altered the expression of numerous genes, including those regulating double-strand break...

10.1016/j.stemcr.2020.12.005 article EN cc-by-nc-nd Stem Cell Reports 2020-12-30

Activated B Lymphocyte Inhibited the Osteoblastogenesis of Bone Mesenchymal Stem Cells by Notch Signaling

Mengxue Pan, Wei Hong, Ye Yao, Xiaoxue Gao, Yi Zhou and 10 more

Estrogen is very important to the differentiation of B lymphocytes; B lymphopoiesis induced by OVX was supposedly involved in osteoporosis. But the effects of B lymphocytes on the osteogenic differentiation of bone mesenchymal stem cells (BMSCs) are not clear. In this study, we detected the bone quality and bone loss in trabecular bone by an electronic universal material testing machine and microcomputed tomography (micro-CT) in splenectomized-ovariectomy (SPX-OVX) rats. Additionally, the changes of lymphocytes (B lymphocyte, CD4+ T lymphocytes, CD8+ T lymphocytes, and macrophages) in bone marrow were...

10.1155/2019/8150123 article EN cc-by Stem Cells International 2019-06-02

Personalized Singing Voice Generation Using WaveRNN

Xiaoxue Gao, Xiaohai Tian, Yi Zhou, Rohan Kumar Das, Haizhou Li

10.21437/odyssey.2020-36 article EN 2020-05-15

Self-Transcriber: Few-Shot Lyrics Transcription With Self-Training

Xiaoxue Gao, Xianghu Yue, Haizhou Li

The current lyrics transcription approaches heavily rely on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive. How to benefit from unlabeled data and alleviate the limited data problem has not been explored for lyrics transcription. We propose the first semi-supervised lyrics transcription paradigm, Self-Transcriber, by leveraging unlabeled data using self-training with noisy student augmentation. We attempt to demonstrate the possibility of lyrics transcription with a small amount of labeled data. Self-Transcriber generates pseudo labels of the unlabeled singing using a teacher...

10.1109/icassp49357.2023.10094717 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05

Calycosin inhibits hepatocyte apoptosis in acute liver failure by suppressing the TLR4/NF‐κB pathway: An in vitro study

Le Chang, Aiqing Zhang, Wenjuan Liu, Ping Cao, Dong Li-xian and 1 more

Acute liver failure (ALF) is a serious disease that is difficult to treat owing to its unclear pathogenesis. This study aimed to investigate the roles and molecular mechanisms of calycosin (CA) in ALF.

10.1002/iid3.935 article EN cc-by Immunity Inflammation and Disease 2023-07-01

NUS-HLT Spoken Lyrics and Singing (SLS) Corpus

Xiaoxue Gao, Berrak Şişman, Rohan Kumar Das, Karthika Vijayan

Although speech-to-singing (STS) voice conversion has been widely studied, a large database for this task has not been constructed yet. We present a new Spoken Lyrics and Singing (SLS) corpus developed at NUS-HLT that can be useful for STS. In this work, the details of the corpus are reported; it contains 3,058 utterances of 90 English songs from 10 professional singers, collected in a recording studio environment. The spoken lyrics corresponding to the songs are also recorded to create the database, which we refer to as the SLS corpus. A comparison singing...

10.1109/icot.2018.8705851 article EN 2018-10-01
