NFDI4DS | UHH-SEMS - Publication Details

StarGAN-VC2: Rethinking Conditional Methods for StarGAN-Based Voice Conversion

FOS: Computer and information sciences; Electrical engineering, electronic engineering, information engineering; 02 engineering and technology; 0202 electrical engineering, electronic engineering, information engineering
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Machine Learning (stat.ML); Audio and Speech Processing (eess.AS)

DOI: 10.21437/interspeech.2019-2236
Publication Date: 2019-09-13T20:32:51Z

AUTHORS (4)

Takuhiro Kaneko

Hirokazu Kameoka

Kou Tanaka

Nobukatsu Hojo

ABSTRACT

Non-parallel multi-domain voice conversion (VC) is a technique for learning mappings among multiple domains without relying on parallel data. This is important but challenging owing to the requirement of learning multiple mappings and the non-availability of explicit supervision. Recently, StarGAN-VC has garnered attention owing to its ability to solve this problem using only a single generator. However, there is still a gap between real and converted speech. To bridge this gap, we rethink the conditional methods of StarGAN-VC, which are key components for achieving non-parallel multi-domain VC in a single model, and propose an improved variant called StarGAN-VC2. Particularly, we rethink conditional methods in two aspects: training objectives and network architectures. For the former, we propose a source-and-target conditional adversarial loss that allows all source domain data to be convertible to the target domain data. For the latter, we introduce a modulation-based conditional method that can transform the modulation of the acoustic feature in a domain-specific manner. We evaluated our methods on non-parallel multi-speaker VC. An objective evaluation demonstrates that our proposed methods improve speech quality in terms of both global and local structure measures. Furthermore, a subjective evaluation shows that StarGAN-VC2 outperforms StarGAN-VC in terms of naturalness and speaker similarity.
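
To make the idea of domain-specific modulation more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the modulation is realized as a conditional instance normalization: each channel of the acoustic feature is normalized over time and then rescaled and shifted with parameters chosen by the domain code. The abstract does not state this mechanism, so treat it as an illustrative assumption. The names `conditional_instance_norm`, `modulate`, and `DOMAIN_PARAMS`, the speaker labels, the choice to key the parameters on a (source, target) pair, and all numeric values are hypothetical; in a trained model the scale and shift would be learned.

```python
import math


def conditional_instance_norm(feature, gamma, beta, eps=1e-5):
    """Normalize each channel over time, then apply a per-channel scale and shift.

    feature: list of channels, each a list of floats over time frames.
    gamma, beta: per-channel scale and shift selected for one domain code.
    """
    out = []
    for channel, g, b in zip(feature, gamma, beta):
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)
        out.append([g * (x - mean) / std + b for x in channel])
    return out


# Hypothetical domain-specific parameters, keyed by (source, target) speaker.
# Fixed by hand here purely for illustration.
DOMAIN_PARAMS = {
    ("spk_a", "spk_b"): {"gamma": [1.0, 0.5], "beta": [0.0, 1.0]},
    ("spk_b", "spk_a"): {"gamma": [2.0, 1.0], "beta": [-1.0, 0.0]},
}


def modulate(feature, source, target):
    """Apply the modulation selected by the (source, target) domain pair."""
    params = DOMAIN_PARAMS[(source, target)]
    return conditional_instance_norm(feature, params["gamma"], params["beta"])


# Toy acoustic feature: 2 channels x 4 time frames.
feature = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 2.0, 2.0]]
a_to_b = modulate(feature, "spk_a", "spk_b")
b_to_a = modulate(feature, "spk_b", "spk_a")
```

The same input yields different outputs for different domain pairs because only the scale and shift change. After normalization, each channel's mean over time equals its `beta` value, which is the sense in which the domain code controls the feature statistics in this sketch.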

CITATIONS (93)
