Bangla Natural Language Processing
In this project, we intend to build a speaker-independent text-to-speech (TTS) system for the Bangla language. To do so, we will use a state-of-the-art (SOTA) TTS system built around Tacotron 2. The system combines two neural network architectures: a modified Tacotron 2 model, which is a recurrent sequence-to-sequence model with attention that generates mel-spectrograms from input text, and WaveGlow, a flow-based neural network model that synthesizes speech from those mel-spectrograms. To this end, we have created a multi-speaker TTS dataset for Bangladesh Bengali (bn-BD) and Indian Bengali (bn-IN) from the Open Speech and Language Resources (OpenSLR) datasets. Our initial experiments focus on the bn-BD dataset, which contains audio from 6 different speakers along with the corresponding text. Dr. Md Iftekhar Tanveer and Dr. Syeda Sakira Hassan are collaborating with us on this project.
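As a rough illustration of how a multi-speaker subset such as bn-BD might be organized before training, the sketch below groups utterances by speaker. It rests on assumptions that are not confirmed by this description: the manifest is taken to be a tab-separated file of `<utterance_id>` and `<transcript>`, utterance IDs are taken to follow a hypothetical `<locale><gender>_<speaker>_<utterance>` pattern, and the function name `group_by_speaker` and the sample rows are invented for the example.

```python
import csv
import io
from collections import defaultdict


def group_by_speaker(manifest_text, locale_prefix="bd"):
    """Group (utterance_id, transcript) pairs by speaker ID.

    Assumed (hypothetical) manifest format: one utterance per line,
    '<utterance_id>\t<transcript>', where the utterance ID looks like
    '<locale><gender>_<speaker>_<utterance>', e.g. 'bdf_00001_0000000001'.
    """
    speakers = defaultdict(list)
    reader = csv.reader(
        io.StringIO(manifest_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        utt_id, transcript = row[0].strip(), row[1].strip()
        parts = utt_id.split("_")
        # Keep only well-formed IDs from the requested locale (bn-BD by default).
        if len(parts) != 3 or not parts[0].startswith(locale_prefix):
            continue
        speakers[parts[1]].append((utt_id, transcript))
    return dict(speakers)


# Invented sample rows, for illustration only (not taken from the real dataset).
SAMPLE = (
    "bdf_00001_0000000001\tআমি বাংলায় কথা বলি\n"
    "bdf_00001_0000000002\tআজ আবহাওয়া ভালো\n"
    "bdm_00002_0000000003\tধন্যবাদ\n"
    "inf_00003_0000000004\tনমস্কার\n"
)

if __name__ == "__main__":
    grouped = group_by_speaker(SAMPLE)
    for speaker, utterances in sorted(grouped.items()):
        print(speaker, len(utterances))
```

Grouping by speaker first makes it straightforward to check how many utterances each of the speakers contributes and to hold out data per speaker; the actual file layout should be verified against the downloaded dataset before relying on this parsing.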