Oral
Glow-TTS: A Generative Flow for Text-to-Speech via Monotonic Alignment Search
Jaehyeon Kim · Sungwon Kim · Jungil Kong · Sungroh Yoon

Mon Dec 07 06:15 PM -- 06:30 PM (PST) @ Orals & Spotlights: Language/Audio Applications

Recently, text-to-speech (TTS) models such as FastSpeech and ParaNet have been proposed to generate mel-spectrograms from text in parallel. Despite this advantage, these parallel TTS models cannot be trained without guidance from autoregressive TTS models, which serve as their external aligners. In this work, we propose Glow-TTS, a flow-based generative model for parallel TTS that does not require any external aligner. By combining the properties of flows and dynamic programming, the proposed model searches on its own for the most probable monotonic alignment between text and the latent representation of speech. We demonstrate that enforcing hard monotonic alignments enables robust TTS that generalizes to long utterances, and that employing generative flows enables fast, diverse, and controllable speech synthesis. Glow-TTS obtains an order-of-magnitude speed-up at synthesis over the autoregressive model Tacotron 2, with comparable speech quality. We further show that our model can be easily extended to a multi-speaker setting.
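
The dynamic-programming search mentioned in the abstract can be illustrated with a minimal sketch. It assumes a precomputed matrix `log_lik[i][j]` holding the log-likelihood of latent speech frame `j` under the prior associated with text token `i`; the function name `monotonic_alignment_search` and this input layout are hypothetical choices made for illustration, not the authors' code. The recurrence is a standard Viterbi-style one in which each frame either stays on the current token or advances to the next, so the alignment is monotonic and skips no token. The exact formulation in the paper may differ.

```python
NEG_INF = float("-inf")


def monotonic_alignment_search(log_lik):
    """Return path, where path[j] is the text token aligned to frame j.

    log_lik[i][j] is an assumed, precomputed log-likelihood of latent
    frame j under the prior of text token i (hypothetical input layout).
    """
    n_text = len(log_lik)
    n_frames = len(log_lik[0])
    if n_frames < n_text:
        raise ValueError("need at least as many frames as text tokens")

    # Q[i][j]: best cumulative log-likelihood of any monotonic alignment
    # of frames 0..j to tokens 0..i that puts frame j on token i.
    Q = [[NEG_INF] * n_frames for _ in range(n_text)]
    Q[0][0] = log_lik[0][0]
    for j in range(1, n_frames):
        # Token i is reachable at frame j only if i <= j.
        for i in range(min(j + 1, n_text)):
            stay = Q[i][j - 1]                       # same token as frame j-1
            advance = Q[i - 1][j - 1] if i > 0 else NEG_INF  # next token
            Q[i][j] = max(stay, advance) + log_lik[i][j]

    # Backtrack from the last frame on the last token to recover the path.
    path = [0] * n_frames
    i = n_text - 1
    for j in range(n_frames - 1, -1, -1):
        path[j] = i
        if j > 0 and i > 0 and Q[i - 1][j - 1] >= Q[i][j - 1]:
            i -= 1
    return path


# Toy example: two tokens, four frames; the first two frames fit token 0
# and the last two fit token 1.
toy = [
    [0.0, 0.0, -5.0, -5.0],
    [-5.0, -5.0, 0.0, 0.0],
]
print(monotonic_alignment_search(toy))
```

The table has one cell per (token, frame) pair, so the search costs time proportional to the number of tokens times the number of frames. In this sketch, ties between staying and advancing are broken in favor of advancing, which is an arbitrary choice.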

Author Information

Jaehyeon Kim (Kakao Enterprise)
Sungwon Kim (Seoul National University)
Jungil Kong (Kakao Enterprise)
Sungroh Yoon (Seoul National University)

Dr. Sungroh Yoon is an Associate Professor in the Department of Electrical and Computer Engineering at Seoul National University, South Korea. He received the B.S. degree from Seoul National University and the M.S. and Ph.D. degrees from Stanford University, CA, all in electrical engineering. He held research positions at Stanford University, CA; Intel Corporation, Santa Clara, CA; and Synopsys, Inc., Mountain View, CA. He was an Assistant Professor in the School of Electrical Engineering, Korea University, from 2007 to 2012. Prof. Yoon is the recipient of the 2013 IEEE/IEIE Joint Award for Young IT Engineers. His research interests include deep learning, machine learning, data-driven artificial intelligence, and large-scale applications, including biomedicine.
