Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

DOI: 10.21437/interspeech.2021-1461 | Publication Date: 2021-08-27
ABSTRACT
This paper introduces Parallel Tacotron 2, a non-autoregressive neural text-to-speech model with fully differentiable duration modeling that does not require supervised duration signals. The model is based on a novel attention mechanism and an iterative reconstruction loss using Soft Dynamic Time Warping, which lets it learn token-frame alignments as well as token durations automatically. Experimental results show that Parallel Tacotron 2 outperforms baselines in subjective naturalness on several diverse multi-speaker evaluations.
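The Soft Dynamic Time Warping loss mentioned in the abstract replaces the hard minimum in the classic DTW recursion with a smoothed minimum, making the alignment cost differentiable with respect to the predicted frames. The sketch below (following Cuturi & Blondel's Soft-DTW formulation, not the paper's own code) shows the core dynamic program; the sequences, the squared-Euclidean frame distance, and the `gamma` smoothing parameter are illustrative assumptions.

```python
import numpy as np

def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW alignment cost between two sequences of frame vectors.

    A minimal sketch of the generic Soft-DTW recursion, not the exact
    loss used in Parallel Tacotron 2. As gamma -> 0 this approaches the
    ordinary (non-differentiable) DTW cost.
    """
    n, m = len(x), len(y)
    # Pairwise squared-Euclidean distances between frames.
    D = np.array([[np.sum((xi - yj) ** 2) for yj in y] for xi in x])
    # DP table; R[i, j] is the soft-minimum cost of aligning x[:i], y[:j].
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            a = np.array([R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]])
            # Numerically stable smoothed minimum:
            # softmin_gamma(a) = -gamma * log(sum(exp(-a / gamma)))
            z = -a / gamma
            zmax = z.max()
            softmin = -gamma * (zmax + np.log(np.sum(np.exp(z - zmax))))
            R[i, j] = D[i - 1, j - 1] + softmin
    return R[n, m]
```

Because every operation is smooth, gradients of this cost with respect to the predicted frame sequence `x` can flow back through the alignment, which is what allows durations to be learned without supervised alignment targets.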