NFDI4DS | UHH-SEMS - Publication Details

WaveNODE: A Continuous Normalizing Flow for Speech Synthesis

FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering; 01 natural sciences; 0105 earth and related environmental sciences; Machine Learning (cs.LG); Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)

DOI: 10.48550/arxiv.2006.04598

Publication Date: 2020-01-01


AUTHORS (6)

Hyeongju Kim

Hyeon Seung Lee

Woo Hyun Kang

Sung Jun Cheon

Byoung Jin Choi

Nam Soo Kim

ABSTRACT

In recent years, various flow-based generative models have been proposed to generate high-fidelity waveforms in real-time. However, these models require either a well-trained teacher network or a number of flow steps, making them memory-inefficient. In this paper, we propose a novel generative model called WaveNODE which exploits a continuous normalizing flow for speech synthesis. Unlike the conventional models, WaveNODE places no constraint on the function used for the flow operation, thus allowing the usage of more flexible and complex functions. Moreover, WaveNODE can be optimized to maximize the likelihood without requiring any teacher network or auxiliary loss terms. We experimentally show that WaveNODE achieves comparable performance with fewer parameters compared to the conventional flow-based vocoders.
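As a rough illustration of the idea behind a continuous normalizing flow (not the WaveNODE architecture itself), the sketch below integrates a one-dimensional ODE dz/dt = f(z, t) and tracks the log-density with the instantaneous change-of-variables rule, d log p / dt = -tr(df/dz). Everything here is an assumption made for illustration: the names `f`, `df_dz`, `integrate`, and `log_likelihood` are hypothetical, the dynamics are an arbitrary smooth function chosen by hand rather than a trained network, and a fixed-step Euler scheme stands in for whatever ODE solver a real implementation would use.

```python
import math

def f(z, t):
    # Hypothetical dynamics. A continuous flow does not need f to be
    # invertible in closed form, so any smooth function can be used here.
    return math.tanh(z) * math.cos(t)

def df_dz(z, t):
    # Trace of the Jacobian of f; in one dimension it is just the derivative.
    return (1.0 - math.tanh(z) ** 2) * math.cos(t)

def integrate(z, t_start, t_end, steps=1000):
    """Euler-integrate dz/dt = f(z, t) from t_start to t_end.

    Returns the final state and the integral of df/dz along the path.
    The step is negative when t_end < t_start (backward integration).
    """
    dt = (t_end - t_start) / steps
    t, trace_integral = t_start, 0.0
    for _ in range(steps):
        trace_integral += dt * df_dz(z, t)
        z += dt * f(z, t)
        t += dt
    return z, trace_integral

def log_likelihood(x, steps=1000):
    """log p(x), assuming a standard normal base at t=0 and data at t=1."""
    # Map the data point back to the base distribution.
    z0, trace_integral = integrate(x, 1.0, 0.0, steps)
    log_base = -0.5 * (z0 * z0 + math.log(2.0 * math.pi))
    # log p(x) = log p(z0) - integral_0^1 tr(df/dz) dt; integrating from
    # t=1 down to t=0 already flips the sign, so the term is added.
    return log_base + trace_integral

if __name__ == "__main__":
    sample, _ = integrate(0.3, 0.0, 1.0)   # base -> data direction
    print(sample, log_likelihood(sample))
```

Because `log_likelihood` is an ordinary scalar function of the data, a model of this kind can in principle be trained by maximizing it directly, which is the property the abstract highlights. The cost of this flexibility is that each evaluation requires solving an ODE.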

