Emotional End-to-end Neural Speech Synthesizer

Younggun Lee, Azam Rabiee, Soo-Young Lee. arXiv 2017 – 61 citations


In this paper, we introduce an emotional speech synthesizer based on Tacotron, a recent end-to-end neural model. Despite its benefits, we found that the original Tacotron suffers from the exposure bias problem and from irregular attention alignments. We address these problems by using the context vector and residual connections in the recurrent neural networks (RNNs). Our experiments showed that the model could be trained successfully and could generate speech for given emotion labels.
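
The abstract gives no architectural details, so the following is only an illustrative sketch of the two ideas it names: feeding an attention context vector and an emotion label into a recurrent step, and wrapping that step in a residual connection. Every name, the dimensions, the one-hot emotion encoding, and the plain tanh cell are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

HIDDEN = 8       # hypothetical hidden size
CONTEXT = 4      # hypothetical attention context size
N_EMOTIONS = 3   # hypothetical number of emotion labels


def one_hot(label, n=N_EMOTIONS):
    """Encode an integer emotion label as a one-hot vector (assumed encoding)."""
    v = np.zeros(n)
    v[label] = 1.0
    return v


def init_params(rng, scale=0.1):
    """Random weights for a plain tanh RNN cell (a stand-in for a GRU/LSTM)."""
    in_dim = HIDDEN + CONTEXT + N_EMOTIONS
    return {
        "W_x": rng.normal(0.0, scale, (in_dim, HIDDEN)),
        "W_h": rng.normal(0.0, scale, (HIDDEN, HIDDEN)),
        "b": np.zeros(HIDDEN),
    }


def residual_rnn_step(x, h, context, emotion, params):
    """One recurrent step conditioned on context and emotion, with a skip path."""
    # Condition the cell on the attention context vector and the emotion label.
    inp = np.concatenate([x, context, emotion])
    h_new = np.tanh(inp @ params["W_x"] + h @ params["W_h"] + params["b"])
    # Residual connection: add the layer input back onto the cell output.
    out = h_new + x
    return out, h_new


rng = np.random.default_rng(0)
params = init_params(rng)
x = rng.normal(size=HIDDEN)          # layer input for the current time step
h = np.zeros(HIDDEN)                 # previous hidden state
context = rng.normal(size=CONTEXT)   # attention context vector (random placeholder)
out, h_next = residual_rnn_step(x, h, context, one_hot(1), params)
print(out.shape, h_next.shape)
```

In this sketch the residual path requires the layer input and the hidden state to share a dimension; a real model would typically need a projection wherever they differ.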
