Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla

Alexander Gutkin

Linne Ha

Martin Jansche

Oddur Kjartansson

Knot Pipatsrisawat

Richard Sproat

5th Workshop on Spoken Language Technologies for Under-resourced languages (SLTU-2016), Procedia Computer Science (Elsevier B.V.), 09--12 May 2016, Yogyakarta, Indonesia, pp. 194-200

Download Google Scholar

Abstract

We present a text-to-speech (TTS) system designed for the dialect of Bengali spoken in Bangladesh. This work is part of an ongoing effort to address the needs of new under-resourced languages. We propose a process for streamlining the bootstrapping of TTS systems for under-resourced languages. First, we use crowdsourcing to collect the data from multiple ordinary speakers, each speaker recording small amount of sentences. Second, we leverage an existing text normalization system for a related language (Hindi) to bootstrap a linguistic front-end for Bangla. Third, we employ statistical techniques to construct multi-speaker acoustic models using Long Short-term Memory Recurrent Neural Network (LSTM-RNN) and Hidden Markov Model (HMM) approaches. We then describe our experiments that show that the resulting TTS voices score well in terms of their perceived quality as measured by Mean Opinion Score (MOS) evaluations.

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations  & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla

Abstract

Research Areas

Learn more about how we conduct our research

Defining the technology of today and tomorrow.

Philosophy

People

Teams

AI/ML Foundations & Capabilities

Algorithms & Optimization

Computing Paradigms

Responsible Human-Centric Technology

Science & Societal Impact

Projects

Publications

Resources

Shaping the future, together.

Student programs

Faculty programs

Conferences & events

Building Statistical Parametric Multi-speaker Synthesis for Bangladeshi Bangla

Abstract

Research Areas

Learn more about how we conduct our research

AI/ML Foundations  & Capabilities