Speech-to-Text-WaveNet : End-to-end sentence-level Chinese speech recognition using DeepMind's WaveNet
A TensorFlow implementation of Chinese speech recognition based on DeepMind's "WaveNet: A Generative Model for Raw Audio" (hereafter "the Paper").
Version
Current Version : 0.0.1
Dependencies
python == 3.5
tensorflow == 1.0.0
librosa == 0.5.0
Dataset
Tsinghua 30-hour Chinese speech dataset (THCHS-30)
Directories
cache: stores the extracted data features and the word dictionary
data: wav files and their labels
model: stores the saved models
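As a rough illustration of what a word dictionary like the one kept in cache could look like, the sketch below maps each character in the label transcripts to an integer id. It is not the repository's code: the function names `build_word_dictionary` and `encode`, the frequency-based ordering, and the sample transcripts are all assumptions made for this example.

```python
from collections import Counter


def build_word_dictionary(transcripts):
    """Map each non-whitespace character to an integer id (hypothetical helper)."""
    counts = Counter(
        ch for line in transcripts for ch in line if not ch.isspace()
    )
    # Sort by descending frequency, then by character, so ids are deterministic.
    ordered = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return {ch: idx for idx, ch in enumerate(ordered)}


def encode(transcript, word_to_id):
    """Convert a transcript to a list of ids, skipping characters not in the dictionary."""
    return [word_to_id[ch] for ch in transcript if ch in word_to_id]


# Made-up transcripts standing in for the label files under data.
labels = ["你好 世界", "你好"]
word_to_id = build_word_dictionary(labels)
encoded = encode("你好", word_to_id)
```

Building the dictionary once and caching it keeps the character-to-id mapping identical between training and testing, which the label encoding depends on.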
Network model
Random shuffling of the data every epoch
Xavier initialization
Adam optimization algorithm
Batch Normalization
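Two of the items above, per-epoch shuffling and Xavier initialization, can be sketched in plain Python. This is a minimal illustration only and not taken from train.py, which presumably relies on TensorFlow's own facilities. The helper names `xavier_uniform` and `epoch_batches`, the layer sizes, and the batch size are assumptions chosen for the example.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Return a fan_in x fan_out weight matrix drawn from U(-limit, limit)."""
    # Xavier (Glorot) uniform bound: sqrt(6 / (fan_in + fan_out)).
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [
        [rng.uniform(-limit, limit) for _ in range(fan_out)]
        for _ in range(fan_in)
    ]


def epoch_batches(samples, batch_size, rng):
    """Yield mini-batches from a freshly shuffled copy of the samples."""
    order = list(samples)
    rng.shuffle(order)  # a new random order on every call, i.e. every epoch
    for start in range(0, len(order), batch_size):
        yield order[start:start + batch_size]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
weights = xavier_uniform(128, 64, rng)  # hypothetical layer sizes

for epoch in range(2):
    for batch in epoch_batches(range(10), 4, rng):
        pass  # a training step would consume `batch` here
```

Reshuffling at the start of each epoch means the network sees the utterances in a different order every pass, and the Xavier bound scales the initial weights to the layer's fan-in and fan-out.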
Train the network
python3 train.py
Test the network
python3 test.py
Other resources
TensorFlow Exercise 15: Chinese speech recognition
ibab's WaveNet (speech synthesis) TensorFlow implementation
buriburisuri's WaveNet (English speech recognition) TensorFlow and sugartensor implementation