Xulong Zhang

ORCID: 0000-0001-7005-992X
Research Areas
  • Speech Recognition and Synthesis
  • Music and Audio Processing
  • Speech and Audio Processing
  • Diverse Musicological Studies
  • Natural Language Processing Techniques
  • Music Technology and Sound Studies
  • Topic Modeling
  • Generative Adversarial Networks and Image Synthesis
  • Emotion and Mood Recognition
  • Asian Culture and Media Studies
  • Face recognition and analysis
  • Human Motion and Animation
  • Nasal Surgery and Airway Studies
  • Handwritten Text Recognition Techniques
  • Advancements in Battery Materials
  • Domain Adaptation and Few-Shot Learning
  • Multimodal Machine Learning Applications
  • Speech and dialogue systems
  • Digital Media Forensic Detection
  • AI in cancer detection
  • Data Analysis with R
  • Advanced Battery Materials and Technologies
  • Scientific Computing and Data Management
  • Advanced Neural Network Applications
  • Research Data Management Practices

Ping An (China)
2021-2025

Shenzhen Technology University
2021-2025

Lanzhou University
2015-2024

Chinese Academy of Medical Sciences & Peking Union Medical College
2022-2024

Lamar University
2021-2024

Jinling Institute of Technology
2024

Ningxia University
2021-2023

Foundation for Biomedical Research
2023

Committee on Publication Ethics
2023

Wuhan University of Science and Technology
2023

Voice Conversion (VC) refers to changing the timbre of speech while retaining the discourse content. Recently, many works have focused on disentanglement-based learning techniques that separate speaker and linguistic content information from the speech signal. Once this succeeds, voice conversion becomes feasible and straightforward. This paper proposes a novel one-shot framework, called AVQVC, that combines vector-quantization-based voice conversion (VQVC) with AutoVC. A new training method is applied so that VQVC separates the two kinds of information more effectively. The result shows that this approach has...

10.1109/icassp43922.2022.9746369 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
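The vector-quantization step at the heart of VQVC-style content encoders can be illustrated with a minimal nearest-neighbour codebook lookup. This is only a sketch of the generic VQ operation; the codebook size, feature dimension, and data below are made up for illustration and are not the paper's configuration.

```python
import numpy as np

def quantize(frames, codebook):
    """Snap each content frame to its nearest codebook entry (L2 distance).

    frames:   (T, D) continuous encoder outputs
    codebook: (K, D) learned discrete content tokens
    Returns the quantized frames and the chosen indices.
    """
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)  # (T, K)
    idx = dists.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
# Frames lying close to codes 1, 3, 3, 5 should map back to those indices.
frames = codebook[[1, 3, 3, 5]] + 0.01 * rng.normal(size=(4, 4))
quantized, idx = quantize(frames, codebook)
```

Because the quantized output can only take codebook values, speaker-specific nuance in the continuous frames is discarded, which is what makes the discrete bottleneck useful for isolating content.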

The vibration caused by blade High Cycle Fatigue (HCF) seriously affects the safe operation of turbomachinery, especially aero-engines. Thus, it is crucially important to identify vibration parameters and then evaluate the dynamic stress amplitude. The Blade Tip Timing (BTT) method is one promising way to solve these problems. However, it needs a high-resolution Once Per Revolution (OPR) signal, which is difficult to obtain. Here, a Coupled Vibration Analysis (CVA) method for identifying the parameters from BTT without an OPR signal is proposed. It assumes that every real blade has its own...

10.1016/j.cja.2020.01.014 article EN cc-by-nc-nd Chinese Journal of Aeronautics 2020-03-19

Voice Conversion (VC) aims to convert the style of a source speaker, such as timbre and pitch, to that of any target speaker while preserving the linguistic content. However, ground truth for the converted speech does not exist in the non-parallel VC scenario, which induces a train-inference mismatch problem. Moreover, existing methods still suffer from inaccurate pitch and low adaptation quality, and there is a significant disparity between the domains. As a result, models tend to generate speech with hoarseness, posing challenges to achieving...

10.48550/arxiv.2501.01861 preprint EN arXiv (Cornell University) 2025-01-03

The effect of the process aid "OPS" on the rheological properties of hydroxyl-terminated polybutadiene propellant was investigated by formulating high-solid-content slurries with different components; the change in slurry viscosity with shear rate, the surface morphology of the solid-phase particles, and the contact angles of the relevant interfaces were characterized. The results showed that the polyalkene polyamine surfactant OPS could significantly reduce the apparent viscosity, with up to a 30% reduction, achieved by adjusting the interfacial properties of the aluminum...

10.3390/polym17030286 article EN Polymers 2025-01-23

10.1109/icassp49660.2025.10889497 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

10.1109/icassp49660.2025.10890303 article EN ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2025-03-12

The Metaverse is an interactive world that combines reality and virtuality, where participants can appear as virtual avatars. Anyone can hold a concert in a virtual hall, and users can quickly identify the real singer behind a virtual idol through singer identification. Most singer identification methods operate on frame-level features. However, beyond the singer's timbre, a music frame also includes other information, such as melodiousness, rhythm, and tonality. This information acts as noise when frame-level features are used to identify singers. In this paper, instead of using only frame-level features, we...

10.1109/ijcnn55064.2022.9892657 article EN 2022 International Joint Conference on Neural Networks (IJCNN) 2022-07-18
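One simple way to move beyond purely frame-level features, as the abstract motivates, is to pool frame embeddings into a single track-level vector before matching singers. The mean-pooling-plus-cosine sketch below is a generic illustration under toy data, not the paper's actual method.

```python
import numpy as np

def track_embedding(frame_feats):
    """Pool (T, D) frame-level features into one track-level vector."""
    return frame_feats.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(1)
singer_a = rng.normal(size=4)
singer_b = -singer_a  # a clearly different "voice" direction
# Two tracks by singer A (different frame-level noise), one by singer B.
track1 = singer_a + 0.1 * rng.normal(size=(50, 4))
track2 = singer_a + 0.1 * rng.normal(size=(50, 4))
track3 = singer_b + 0.1 * rng.normal(size=(50, 4))
e1, e2, e3 = map(track_embedding, (track1, track2, track3))
```

Averaging over frames suppresses the per-frame "noise" (melody, rhythm, tonality) that the abstract identifies, so tracks by the same singer end up closer than tracks by different singers.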

The any-to-any voice conversion problem aims to convert voices between source and target speakers that are outside the training data. Previous works widely utilize disentangle-based models. Such a model assumes that speech consists of content and speaker style information, untangles them, and changes the style information to achieve conversion. These works focus on reducing the embedding dimension to isolate content information, but the right size is hard to determine, which can lead to an overlapping problem. We propose the Disentangled Representation Voice Conversion (DRVC) model to address this issue. DRVC is an end-to-end self-supervised...

10.1109/icassp43922.2022.9747434 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
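The disentangle-then-recombine idea behind such models can be caricatured in a few lines: if an utterance embedding were literally content plus an additive speaker-style offset, conversion would just swap offsets. DRVC learns this decomposition end-to-end; the additive model below is only a toy assumption, not the paper's architecture.

```python
import numpy as np

def convert(src_utt, src_style, tgt_style):
    """Toy conversion: strip the source speaker offset, add the target's."""
    content = src_utt - src_style   # the "disentangled" content part
    return content + tgt_style

content = np.array([1.0, 2.0, 3.0])
style_a = np.array([10.0, 10.0, 10.0])   # hypothetical speaker-A style
style_b = np.array([-5.0, -5.0, -5.0])   # hypothetical speaker-B style
utt_a = content + style_a
converted = convert(utt_a, style_a, style_b)
```

The dimension-size problem the abstract mentions shows up here too: if the "content" subspace is too large it leaks style, and if too small it loses content, which motivates learning the split rather than fixing it by hand.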

Existing emotional speech synthesis methods often utilize an utterance-level style embedding extracted from reference audio, neglecting the inherent multi-scale property of prosody. We introduce ED-TTS, a model that leverages Speech Emotion Diarization (SED) and Speech Emotion Recognition (SER) to model emotions at different levels. Specifically, our proposed approach integrates the utterance-level emotion embedding extracted by SER with the fine-grained frame-level embeddings obtained from SED. These embeddings are used to condition the reverse process of the denoising diffusion...

10.1109/icassp48485.2024.10446467 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
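Combining an utterance-level emotion embedding with frame-level ones, as ED-TTS describes, can be sketched as broadcasting the utterance vector over the frame axis to form a per-frame conditioning signal. The shapes and the simple additive fusion below are illustrative assumptions, not the model's actual conditioning scheme.

```python
import numpy as np

def condition_signal(utt_emb, frame_embs):
    """Broadcast an utterance-level embedding (D,) across frame-level
    embeddings (T, D) to produce a per-frame conditioning signal (T, D)."""
    return frame_embs + utt_emb[None, :]

utt = np.ones(8)             # e.g. an SER-style utterance-level embedding
frames = np.zeros((20, 8))   # e.g. SED-style frame-level embeddings
cond = condition_signal(utt, frames)
```

A per-frame signal like this is what a denoising diffusion decoder can consume at every step, so coarse emotion and fine-grained frame labels act on the same time grid.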

Voice conversion refers to transferring speaker identity with well-preserved content. Better disentanglement of speech representations leads to better voice conversion. Recent studies have found that phonetic information extracted from the input audio has the potential to represent content well. Besides, speaker-style modeling with pre-trained models makes the conversion process more complex. To tackle these issues, we introduce a new method named "CTVC", which utilizes disentangled representations based on contrastive learning and time-invariant...

10.1109/icassp48485.2024.10447283 article EN ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024-03-18
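Contrastive objectives of the kind CTVC builds on are commonly instantiated as InfoNCE: each anchor is pulled toward its matched positive and pushed from in-batch negatives. This NumPy version shows the generic loss only; the embeddings, batch size, and temperature are made up, and this is not the paper's training objective verbatim.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: row i of `positives` is the positive for anchor i;
    every other row in the batch serves as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                  # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))       # -log p(positive | anchor)

x = np.eye(4)                          # four orthogonal toy embeddings
loss_matched = info_nce(x, x)          # positives correctly aligned
loss_shuffled = info_nce(x, x[[1, 2, 3, 0]])  # positives misaligned
```

When pairs line up, the diagonal dominates the softmax and the loss is near zero; shuffling the positives makes the loss large, which is the gradient signal that organizes the representation space.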

The lithium-ion battery is widely utilized in space applications for its significant performance advantages. The safety and reliability of lithium-ion batteries are critical for spacecraft, so it is essential to assess the degradation and estimate the state of the battery. Meanwhile, as a brand-new concept from Cyber-Physical Systems (CPS), the Digital Twin is used in the smart manufacturing industry due to its advantages in real-time capability, stability, and reliability. Thus, it can be applied to the battery pack to ensure safety. So far, however, there has been no application of the Digital Twin to battery management and assessment. As...

10.1109/i2mtc.2019.8827160 article EN 2019 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) 2019-05-01

Singing voice detection, or vocal detection, is a classification task that determines whether a given audio segment contains singing voices. This task plays a very important role in vocal-related music information retrieval tasks, such as singer identification. Although humans can easily distinguish between singing and non-singing parts, it is still difficult for machines to do so. Most existing methods focus on feature engineering with classifiers, which relies on the experience of the algorithm designer. In recent years, deep...

10.3390/electronics9091458 article EN Electronics 2020-09-07
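Framewise vocal/non-vocal decisions like those the abstract describes are typically post-processed to remove spurious single-frame flips, for example with median smoothing. The thresholding and window length below are illustrative choices, not the paper's pipeline.

```python
import numpy as np

def smooth_predictions(frame_probs, threshold=0.5, win=5):
    """Threshold framewise vocal probabilities, then median-filter the
    binary sequence to suppress isolated mis-classified frames."""
    labels = (np.asarray(frame_probs) > threshold).astype(int)
    pad = win // 2
    padded = np.pad(labels, pad, mode="edge")
    return np.array([int(np.median(padded[i:i + win]))
                     for i in range(len(labels))])

# A vocal region with one dropped frame, followed by a non-vocal region.
probs = [0.9, 0.9, 0.2, 0.9, 0.9, 0.9, 0.1, 0.1, 0.1, 0.1]
smoothed = smooth_predictions(probs)
```

The isolated low-probability frame inside the vocal region is filled in, while the sustained non-vocal tail is left untouched.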

Vocal melody extraction is an important and challenging task in music information retrieval. One main difficulty is that, most of the time, various instruments and singing voices are mixed according to the harmonic structure, making it hard to identify the fundamental frequency (F0) of a singing voice. Therefore, reducing the interference of the accompaniment is beneficial to pitch estimation. In this paper, we first adopted a high-resolution network (HRNet) to separate vocals from polyphonic music, and then designed an encoder-decoder network to estimate...

10.3390/electronics10030298 article EN Electronics 2021-01-26
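Once an encoder-decoder network emits a time-frequency salience map for the separated vocal, a minimal decoding step is to take the per-frame argmax and convert the bin index to Hz. The semitone grid, voicing threshold, and `fmin` below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def f0_from_salience(salience, fmin=55.0):
    """Pick the per-frame F0 as the argmax over a (T, B) salience map,
    mapping bin index to Hz on a one-bin-per-semitone grid from `fmin`.
    Frames whose peak salience is below 0.5 are marked unvoiced (0 Hz)."""
    idx = salience.argmax(axis=1)
    voiced = salience.max(axis=1) >= 0.5
    hz = fmin * 2.0 ** (idx / 12.0)
    return np.where(voiced, hz, 0.0)

sal = np.zeros((3, 25))
sal[0, 12] = 0.9   # one octave above fmin -> 110 Hz
sal[1, 0] = 0.8    # fmin itself -> 55 Hz
# frame 2 has no salient peak -> unvoiced
f0 = f0_from_salience(sal)
```

Separating the vocal first, as the abstract argues, makes this argmax far more reliable because accompaniment harmonics no longer compete for the peak.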

Multi-speaker text-to-speech (TTS) using only a few adaptation samples is a challenge in practical applications. To address this, we propose a zero-shot multi-speaker TTS model, named nnSpeech, that can synthesize a new speaker's voice without fine-tuning and with only one utterance. Compared with methods that use a speaker representation module to extract the characteristics of new speakers, our method is based on a speaker-guided conditional variational autoencoder that generates a latent variable Z containing both speaker and content information. The latent variable Z...

10.1109/icassp43922.2022.9746875 article EN ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2022-04-27
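A conditional variational autoencoder samples its latent variable Z with the standard reparameterization trick, z = mu + sigma * eps, so gradients flow through the mean and variance. The sketch below shows only that generic trick with made-up dimensions; it is not nnSpeech's speaker-guided model itself.

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps (reparameterization trick), keeping
    the sampling step differentiable w.r.t. mu and log_var."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0, 2.0])
# log_var -> very negative means sigma -> 0: z collapses to the mean.
z_deterministic = reparameterize(mu, np.full(3, -1e9), rng)
z_stochastic = reparameterize(mu, np.zeros(3), rng)
```

In a speaker-guided CVAE the mean and variance would be predicted from the text and a speaker condition, so a single reference utterance is enough to shape the distribution Z is drawn from.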

Recent expressive text-to-speech (TTS) models focus on synthesizing emotional speech, but some fine-grained styles such as intonation are neglected. In this paper, we propose QI-TTS, which aims to better transfer and control intonation, and further to deliver the speaker's questioning intention, while transferring emotion from reference speech. We use a multi-style extractor to extract style embeddings at two different levels: the sentence level represents emotion, and the final syllable level represents intonation. For intonation control, we use relative...

10.1109/icassp49357.2023.10095623 article EN ICASSP 2023 - 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2023-05-05