Tacotron2-Khmer

2020-04-02

Tacotron 2

A PyTorch implementation of Tacotron2, described in "Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions". Tacotron2 is an end-to-end text-to-speech (TTS) neural network architecture that converts a character sequence directly to speech.

Dataset

The Aishell dataset, which contains 400 speakers and over 170 hours of Mandarin speech data.

Dependency

  • Python 3.5.2

  • PyTorch 1.0.0

Usage

Data Pre-processing

Extract data_aishell.tgz:

$ python extract.py
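
The contents of extract.py are not shown here. As a rough sketch of what such a step might do, the following unpacks a gzip-compressed tar archive with the standard-library tarfile module. The function name extract_archive and its arguments are hypothetical, chosen only for illustration, and are not taken from the repository.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir):
    """Unpack a .tgz archive into dest_dir and return the member names.

    Hypothetical helper: the real extract.py may work differently.
    """
    # Create the destination folder if it does not exist yet.
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = tar.getnames()
        tar.extractall(dest_dir)
    return names
```

Under the assumption that the archive sits at data/data_aishell.tgz, a call such as extract_archive("data/data_aishell.tgz", "data") would unpack it next to the archive.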

Extract wav files into train/dev/test folders:

$ cd data/data_aishell/wav
$ find . -name '*.tar.gz' -execdir tar -xzvf '{}' \;
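
On systems without find, the same step can be approximated in Python. This is a minimal sketch, not part of the repository: it collects every *.tar.gz file under a root folder and extracts each one into the directory that contains it. The function name extract_nested_archives is hypothetical.

```python
import os
import tarfile


def extract_nested_archives(root):
    """Extract every *.tar.gz under root into its own parent directory.

    Hypothetical helper that mirrors the find/tar command above.
    """
    # Collect the archives first so extraction does not alter the walk.
    archives = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.endswith(".tar.gz"):
                archives.append(os.path.join(dirpath, name))

    for path in archives:
        with tarfile.open(path, "r:gz") as tar:
            tar.extractall(os.path.dirname(path))
    return archives
```

Run from the project root, extract_nested_archives("data/data_aishell/wav") would play the role of the two shell commands.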

Scan transcript data, generate features:

$ python pre_process.py

The folder structure under the data folder should now look something like this:

data/
    data_aishell.tgz
    data_aishell/
        transcript/
            aishell_transcript_v0.8.txt
        wav/
            train/
            dev/
            test/
    aishell.pickle
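
Before training, it can help to confirm that pre-processing produced this layout. The sketch below is an illustration, not a repository script: it lists which of the expected paths are missing. The names EXPECTED_PATHS and missing_paths are hypothetical, and the path list is copied from the tree above.

```python
import os

# Paths taken from the folder tree above, relative to the data folder.
EXPECTED_PATHS = [
    "data_aishell/transcript/aishell_transcript_v0.8.txt",
    "data_aishell/wav/train",
    "data_aishell/wav/dev",
    "data_aishell/wav/test",
    "aishell.pickle",
]


def missing_paths(data_dir="data"):
    """Return the expected paths that do not exist under data_dir."""
    missing = []
    for rel in EXPECTED_PATHS:
        full = os.path.join(data_dir, *rel.split("/"))
        if not os.path.exists(full):
            missing.append(rel)
    return missing
```

An empty list from missing_paths("data") suggests the earlier steps completed. It does not check that the files themselves are valid.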

Train

$ python train.py

To visualize progress during training, run in your terminal:

$ tensorboard --logdir runs

Demo

Generate a mel-spectrogram for the text "Waveglow is really awesome!":

$ python demo.py

