NFDI4DS | UHH-SEMS - Publication Details

Yanqing Liu

ORCID: 0000-0003-0412-8805

About

Contact & Profiles

A5100360935

Research Areas

Speech Recognition and Synthesis
Stability and Control of Uncertain Systems
Speech and Audio Processing
Natural Language Processing Techniques
Control Systems and Identification
Fault Detection and Control Systems
Topic Modeling
Cooperative Communication and Network Coding
Music and Audio Processing
Advanced MIMO Systems Optimization
Advanced Control Systems Optimization
Neural Networks Stability and Synchronization
Vibration Control and Rheological Fluids
Wireless Communication Security Techniques
Speech and Dialogue Systems
Cognitive Radio Networks and Spectrum Sensing
Advanced Wireless Communication Techniques
Structural Engineering and Vibration Analysis
Energy Harvesting in Wireless Networks
Vehicle Dynamics and Control Systems
PAPR reduction in OFDM
Domain Adaptation and Few-Shot Learning
Advanced Graph Theory Research
Plant and Fungal Interactions Research
Error Correcting Code Techniques

Bank of China
2024

Microsoft Research Asia (China)
2022-2024

National Supercomputing Center in Wuxi
2023

Xidian University
2022-2023

Microsoft (United States)
2020-2023

Beijing Forestry University
2022-2023

State Administration of Traditional Chinese Medicine of the People's Republic of China
2023

Yangzhou University
2023

Tsinghua University
2022

Jiangnan University
2013-2022

Neural Speech Synthesis with Transformer Network

OPENALEX - Publications

Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming Liu

Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) have been proposed and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency during training and inference; 2) difficulty modeling long dependencies using current recurrent neural networks (RNNs). Inspired by the success of the Transformer network in neural machine translation (NMT), in this paper we introduce and adapt the multi-head attention mechanism to replace the RNN structures and also the original attention mechanism in Tacotron2. With the help...

10.1609/aaai.v33i01.33016706 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2019-07-17

Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers

Chengyi Wang Sanyuan Chen Yu Wu Ziqiang Zhang Long Zhou and 8 more

We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech, which is hundreds of times larger than existing systems. VALL-E emerges in-context learning capabilities and can be used to synthesize...

10.48550/arxiv.2301.02111 preprint EN other-oa arXiv (Cornell University) 2023-01-01

NaturalSpeech: End-to-End Text-to-Speech Synthesis with Human-Level Quality

Xu Tan Jiawei Chen Haohe Liu Jian Cong Chen Zhang and 9 more

Text-to-speech (TTS) has made rapid progress in both academia and industry in recent years. Some questions naturally arise: whether a TTS system can achieve human-level quality, how to define/judge that quality, and how to achieve it. In this paper, we answer these questions by first defining the human-level quality based on the statistical significance of a subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on benchmark datasets. Specifically, we leverage a variational auto-encoder (VAE) for...

10.1109/tpami.2024.3356232 article EN IEEE Transactions on Pattern Analysis and Machine Intelligence 2024-01-01

Developing RNN-T Models Surpassing High-Performance Hybrid Models with Customization Capability

Jinyu Li Rui Zhao Zhong Meng Yanqing Liu Wenning Wei and 6 more

Because of its streaming nature, the recurrent neural network transducer (RNN-T) is a very promising end-to-end (E2E) model that may replace the popular hybrid model for automatic speech recognition. In this paper, we describe our recent development of RNN-T models with reduced GPU memory consumption during training, a better initialization strategy, and advanced encoder modeling with future lookahead. When trained with Microsoft's 65 thousand hours of anonymized training data, the developed RNN-T model surpasses a very well trained hybrid model with both better recognition...

10.21437/interspeech.2020-3016 article EN Interspeech 2020 2020-10-25

NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

Kai Shen Zeqian Ju Xu Tan Yanqing Liu Yichong Leng and 4 more

Scaling text-to-speech (TTS) to large-scale, multi-speaker, and in-the-wild datasets is important to capture the diversity in human speech such as speaker identities, prosodies, and styles (e.g., singing). Current large TTS systems usually quantize speech into discrete tokens and use language models to generate these tokens one by one, which suffer from unstable prosody, word skipping/repeating issues, and poor voice quality. In this paper, we develop NaturalSpeech 2, a TTS system that leverages a neural audio codec with residual...

10.48550/arxiv.2304.09116 preprint EN other-oa arXiv (Cornell University) 2023-01-01

Exponential stability of fuzzy cellular neural networks with constant and time-varying delays

Yanqing Liu Wansheng Tang

10.1016/j.physleta.2004.01.064 article EN Physics Letters A 2004-02-06

Experimental unsupervised learning of non-Hermitian knotted phases with solid-state spins

Yefei Yu Li-Wei Yu Wengang Zhang Huili Zhang Xiaolong Ouyang and 3 more

Non-Hermiticity has widespread applications in quantum physics. It brings about distinct topological phases without Hermitian counterparts, and gives rise to the fundamental challenge of phase classification. Here, we report an experimental demonstration of unsupervised learning of non-Hermitian topological phases with the nitrogen-vacancy center platform. In particular, we implement the twister model, which hosts peculiar knotted phases, with a solid-state simulator consisting of an electron spin and a nearby ¹³C nuclear spin in diamond. By...

10.1038/s41534-022-00629-w article EN cc-by npj Quantum Information 2022-09-24

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

Xu Tan Jiawei Chen Haohe Liu Jian Cong Chen Zhang and 9 more

Text to speech (TTS) has made rapid progress in both academia and industry in recent years. Some questions naturally arise: whether a TTS system can achieve human-level quality, how to define/judge that quality, and how to achieve it. In this paper, we answer these questions by first defining the human-level quality based on the statistical significance of a subjective measure and introducing appropriate guidelines to judge it, and then developing a TTS system called NaturalSpeech that achieves human-level quality on a benchmark dataset. Specifically, we leverage a variational autoencoder (VAE) for end-to-end...

10.48550/arxiv.2205.04421 preprint EN other-oa arXiv (Cornell University) 2022-01-01

Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling

Ziqiang Zhang Long Zhou Chengyi Wang Sanyuan Chen Yu Wu and 8 more

We propose a cross-lingual neural codec language model, VALL-E X, for cross-lingual speech synthesis. Specifically, we extend VALL-E and train a multi-lingual conditional codec language model to predict the acoustic token sequences of the target language speech by using both the source language speech and the target language text as prompts. VALL-E X inherits strong in-context learning capabilities and can be applied to zero-shot cross-lingual text-to-speech synthesis and zero-shot speech-to-speech translation tasks. Experimental results show that it can generate high-quality speech in the target language via just one speech utterance in the source language as a prompt while preserving the unseen...

10.48550/arxiv.2303.03926 preprint EN other-oa arXiv (Cornell University) 2023-01-01

DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021

Yanqing Liu Zhihang Xu Gang Wang Kuan Chen Bohan Li and 4 more

10.21437/blizzard.2021-14 article EN 2021-10-23

Spectrum Sharing in MIMO Cognitive Radio Networks Based on Cooperative Game Theory

Yanqing Liu Liang Dong

In a MIMO cognitive radio network, multiple secondary users sense the spatial channels and share the spectrum use with incumbent primary users. Each secondary transmitter competes with the others to increase its own information rate while generating limited total interference to the primary receivers. In order to maximize the sum-rate of the secondary users, the problem of secondary user transmission is modeled as a cooperative game. The strategy of each secondary user is its transmit covariance matrix, and the utility is an approximation of its information rate. The secondary users negotiate over the allocation of the interference budget to reach a bargaining solution that...

10.1109/twc.2014.2331287 article EN IEEE Transactions on Wireless Communications 2014-06-24

Event‐triggered constrained control of positive systems with input saturation

Yanyan Yin Zongli Lin Yanqing Liu Kok Lay Teo

This paper addresses the problem of event‐triggered stabilization for positive systems subject to input saturation, where the state variables are in the nonnegative orthant. An event‐triggered linear state feedback law is constructed. By expressing the saturated linear feedback law on a convex hull of a group of auxiliary linear feedback laws, we establish conditions under which the closed‐loop system is asymptotically stable with a given set contained in the domain of attraction. On the basis of these conditions, the problem of designing the feedback gain and the event‐triggering strategy for attaining the largest...

10.1002/rnc.4097 article EN publisher-specific-oa International Journal of Robust and Nonlinear Control 2018-04-02

Neural Speech Synthesis with Transformer Network

Naihan Li Shujie Liu Yanqing Liu Sheng Zhao Ming Liu and 1 more

Although end-to-end neural text-to-speech (TTS) methods (such as Tacotron2) have been proposed and achieve state-of-the-art performance, they still suffer from two problems: 1) low efficiency during training and inference; 2) difficulty modeling long dependencies using current recurrent neural networks (RNNs). Inspired by the success of the Transformer network in neural machine translation (NMT), in this paper we introduce and adapt the multi-head attention mechanism to replace the RNN structures and also the original attention mechanism in Tacotron2. With the help...

10.48550/arxiv.1809.08895 preprint EN other-oa arXiv (Cornell University) 2018-01-01

Distributed leader-following consensus of nonlinear multi-agent systems with nonlinear input dynamics

Yi Shi Yanyan Yin Song Wang Yanqing Liu Fei Liu

10.1016/j.neucom.2018.01.059 article EN Neurocomputing 2018-02-02

Univariate time series classification using information geometry

Jiancheng Sun Yong Yang Yanqing Liu Chunlin Chen Wenyuan Rao and 1 more

10.1016/j.patcog.2019.05.040 article EN Pattern Recognition 2019-05-31

Robust fault detection of singular Markov jump systems with partially unknown information

Yanyan Yin Jiangbin Shi Fei Liu Yanqing Liu

10.1016/j.ins.2020.05.069 article EN Information Sciences 2020-05-29

RobuTrans: A Robust Transformer-Based Text-to-Speech Model

Naihan Li Yanqing Liu Yu Wu Shujie Liu Sheng Zhao and 1 more

Recently, neural network based speech synthesis has achieved outstanding results, by which the synthesized audios are of excellent quality and naturalness. However, current neural TTS models suffer from the robustness issue, which results in abnormal audios (bad cases), especially for unusual text (unseen context). To build a neural model which can synthesize both natural and stable audios, in this paper we make a deep analysis of why the previous neural TTS models are not robust, based on which we propose RobuTrans (Robust Transformer), a robust neural TTS model based on Transformer. Comparing to...

10.1609/aaai.v34i05.6337 article EN Proceedings of the AAAI Conference on Artificial Intelligence 2020-04-03

Mixed-Phoneme BERT: Improving BERT with Mixed Phoneme and Sup-Phoneme Representations for Text to Speech

Guangyan Zhang Kaitao Song Xu Tan Daxin Tan Yuzi Yan and 6 more

10.21437/interspeech.2022-621 article EN Interspeech 2022 2022-09-16

FELLE: Autoregressive Speech Synthesis with Token-Wise Coarse-to-Fine Flow Matching

Hui Wang Shujie Liu Lingwei Meng Jinyu Li Yifan Yang and 7 more

To advance continuous-valued token modeling and temporal-coherence enforcement, we propose FELLE, an autoregressive model that integrates language modeling with token-wise flow matching. By leveraging the autoregressive nature of language models and the generative efficacy of flow matching, FELLE effectively predicts continuous-valued tokens (mel-spectrograms). For each continuous-valued token, FELLE modifies the general prior distribution in flow matching by incorporating information from the previous step, improving coherence and stability. Furthermore, to enhance synthesis quality, FELLE introduces a...

10.48550/arxiv.2502.11128 preprint EN arXiv (Cornell University) 2025-02-16

DelightfulTTS 2: End-to-End Speech Synthesis with Adversarial Vector-Quantized Auto-Encoders

Yanqing Liu Ruiqing Xue Lei He Xu Tan Sheng Zhao

10.21437/interspeech.2022-277 article EN Interspeech 2022 2022-09-16

Towards Contextual Spelling Correction for Customization of End-to-End Speech Recognition Systems

Xiaoqiang Wang Yanqing Liu Jinyu Li Veljko Miljanic Sheng Zhao and 1 more

Contextual biasing is an important and challenging task for end-to-end automatic speech recognition (ASR) systems, which aims to achieve better recognition performance by biasing the ASR system to particular context phrases such as person names, music lists, proper nouns, etc. Existing methods mainly include contextual LM biasing and adding a bias encoder into end-to-end ASR models. In this work, we introduce a novel approach to do contextual biasing by adding a contextual spelling correction model on top of the end-to-end ASR system. We incorporate contextual information into the sequence-to-sequence spelling correction model with a shared context encoder. The...

10.1109/taslp.2022.3205753 article EN IEEE/ACM Transactions on Audio Speech and Language Processing 2022-01-01

Vibration Isolation by a Variable Stiffness and Damping System

Yanqing Liu Hiroshi Matsuhisa Hideo Utsuno Jeong Gyu Park

Most passive vibration isolation systems are composed of springs and dampers. Although it is possible to improve the isolation performance by active control, the complexity, power requirements and cost of such a system have restricted its use. A system with variable damping is practical and has good performance in the high frequency region, but was found not to improve responses in the low frequency region. On the basis of the on-off damping control method, a variable stiffness control method and a combination method were proposed. Comparison among proposed methods conventional showed that had best properties whole new...

10.1299/jsmec.48.305 article EN JSME International Journal Series C 2005-01-01

Vibration Control by a Variable Damping and Stiffness System with Magnetorheological Dampers

Yanqing Liu Hiroshi Matsuhisa Hideo Utsuno Jeong Gyu Park

A vibration isolation system with variable damping and stiffness control is practical and has good performance. However, conventional devices for variable damping and stiffness are usually complicated. A magnetorheological (MR) fluid damper only needs a small electric current to provide the magnetic field. It is easy to implement an MR damper in such systems. In this paper, two dampers in series were used for the system. The passive, variable damping, variable stiffness, and variable damping and stiffness systems were investigated by experiment and theoretical calculation. The time and frequency responses to sinusoidal, sweep...

10.1299/jsmec.49.411 article EN JSME International Journal Series C 2006-01-01
