Tacotron over MXNet
A tech demo of MXNet capabilities consisting of a Tacotron implementation. This is a work in progress.
This project was made over 8 weeks, from October to December 2017, at the PiCampus AI School in Rome.
List of functionalities and TODOs
[x] Multithreaded data iterator
[x] DSP tools
[x] CBHG module for spectrograms
[x] Basic seq2seq example for string reversal. It will be used as the Tacotron backbone
[ ] Encoder with CBHG
[ ] Attention model
[ ] Custom decoder that processes r * mel_bands spectrogram frames at each time step during cell unrolling
[ ] Switch to MXNet 1.0
[ ] Switch to Gluon
[ ] Clean up and organize code for better understanding
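The custom decoder item above refers to emitting r * mel_bands values per time step, that is, r mel frames at once. The following is a minimal sketch of how a mel spectrogram could be grouped that way. It is not code from this project: it uses plain NumPy rather than MXNet, the function name `group_frames` is hypothetical, and zero-padding the tail is an assumption.

```python
import numpy as np


def group_frames(mel, r):
    """Group r consecutive mel frames into one decoder target.

    mel: array of shape (num_frames, mel_bands)
    r:   number of frames emitted per decoder step (assumed meaning of "r")
    Returns an array of shape (ceil(num_frames / r), r * mel_bands).
    """
    num_frames, mel_bands = mel.shape
    # Zero-pad the time axis so its length is a multiple of r (assumption).
    pad = (-num_frames) % r
    padded = np.pad(mel, ((0, pad), (0, 0)), mode="constant")
    # Row-major reshape concatenates each run of r frames into one row.
    return padded.reshape(-1, r * mel_bands)


# Toy input: 7 frames with 4 mel bands each, grouped 3 frames per step.
mel = np.arange(7 * 4, dtype=np.float32).reshape(7, 4)
grouped = group_frames(mel, r=3)
print(grouped.shape)
```

Each row of `grouped` would then be the target for a single decoder step, so the unrolled decoder needs roughly num_frames / r steps instead of num_frames.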
Getting Started
With the default settings, a simple dataset is used for training. Prediction samples are generated at the end of the training phase.
If you want to train on a larger dataset, Kyubyong has cut and formatted this English Bible. You can find his dataset here and the CSV text here.
Prerequisites
This project has been developed on
Authors
This project was developed by Alberto Massidda and Stefano Artuso during Pi School's AI programme in Fall 2017.
Acknowledgments
Thanks to Roberto Barra Chicote for supporting us.
Thanks to Keith Ito (https://github.com/keithito) and Kyubyong Park (https://github.com/Kyubyong) for helping us get started.