A waveform concatenation technique for text-to-speech synthesis

DOI: 10.1007/s10772-017-9463-8
Publication Date: 2017-10-07T02:37:45Z
ABSTRACT
Designing text-to-speech systems that produce natural-sounding speech segments in different Indian languages is a challenging and ongoing problem. Because Indian languages have a large number of possible pronunciations, a concatenative speech synthesis technique must store a large number of speech segments in its speech database to achieve highly natural speech. The resulting database size, however, makes such systems unusable on small handheld devices or human-computer interaction systems with limited storage resources. In this paper, we propose a fraction-based waveform concatenation technique that produces intelligible speech segments from a small-footprint speech database. The results of all the experiments performed show that the proposed technique produces intelligible speech segments in different Indian languages with much lower storage and computation overhead than the existing syllable-based technique.
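As a rough illustration of what waveform concatenation involves in general, the sketch below joins stored waveform units end to end and smooths each joint with a short linear crossfade. It is not the fraction-based method proposed in the paper, whose details the abstract does not give. The function name `crossfade_concat`, the `overlap` parameter, and the choice of a linear crossfade are all assumptions made for this example.

```python
def crossfade_concat(units, overlap):
    """Join waveform units (lists of float samples) into one waveform.

    Hypothetical helper for illustration only: at each joint, the last
    `overlap` samples of the output so far are blended linearly with the
    first `overlap` samples of the incoming unit.
    """
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        unit = list(unit)
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            # Weight of the incoming unit rises from near 0 to near 1.
            w = (i + 1) / (n + 1)
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        # Append the part of the incoming unit that was not blended.
        out.extend(unit[n:])
    return out


if __name__ == "__main__":
    # Two toy "units": silence followed by a constant level.
    joined = crossfade_concat([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]], overlap=2)
    print(joined)
```

Each joint shortens the output by `overlap` samples, so a smaller overlap keeps more of each unit while a larger one gives a smoother transition. Real concatenative systems typically also select or modify units to match pitch and duration, which this sketch leaves out.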
REFERENCES (44)
CITATIONS (11)