Expressive Speech Synthesis via Modeling Expressions with Variational Autoencoder

DOI: 10.21437/interspeech.2018-1113 Publication Date: 2018-08-28T09:55:42Z
ABSTRACT
Recent advances in neural autoregressive models have improved the performance of speech synthesis (SS). However, as they lack the ability to model global characteristics of speech (such as speaker individualities or speaking styles), particularly when these characteristics have not been labeled, making neural autoregressive SS systems more expressive is still an open issue. In this paper, we propose to combine VoiceLoop, an autoregressive SS model, with a Variational Autoencoder (VAE). This approach, unlike traditional autoregressive SS systems, uses the VAE to model the global characteristics explicitly, enabling the expressiveness of the synthesized speech to be controlled in an unsupervised manner. Experiments using the VCTK and Blizzard2012 datasets show that the VAE helps VoiceLoop generate higher quality speech and control the expressions in its synthesized speech by incorporating global characteristics into the speech generating process.
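To make the role of the VAE more concrete, the following is a minimal sketch of the two generic VAE ingredients that an approach like this relies on: sampling a latent code with the reparameterization trick, and the KL regularizer toward a standard normal prior. It is not the authors' implementation. The function names, the two-dimensional latent, and the numeric values are all assumptions made for illustration, and the abstract does not specify how the latent code is fed into VoiceLoop.

```python
import math
import random


def sample_latent(mu, log_var, rng):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, diag(sigma^2)) || N(0, I)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def interpolate(z_a, z_b, t):
    """Linear blend of two latent codes (t=0 gives z_a, t=1 gives z_b)."""
    return [(1.0 - t) * a + t * b for a, b in zip(z_a, z_b)]


# Hypothetical encoder outputs for one utterance (made-up values).
rng = random.Random(0)
mu = [0.5, -1.0]
log_var = [0.0, -2.0]

z = sample_latent(mu, log_var, rng)       # latent code for this utterance
kl = kl_to_standard_normal(mu, log_var)   # regularization term in the loss
midpoint = interpolate([0.0, 0.0], [2.0, 4.0], 0.5)
```

In a setup of this kind, the latent code `z` would be learned without style labels, which is what allows expressiveness to be controlled in an unsupervised manner; the `interpolate` helper illustrates one plausible way to move between two inferred styles at synthesis time.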