
Tacotron 2 (without wavenet)

PyTorch implementation of Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions.

This implementation includes distributed and fp16 support and uses the LJSpeech dataset.

Distributed and FP16 support relies on work by Christian Sarofeen and NVIDIA's Apex Library.

Visit our [website] for audio samples using our published [Tacotron 2] and [WaveGlow] models.

Figure: Alignment, Predicted Mel Spectrogram, Target Mel Spectrogram

Pre-requisites

  1. NVIDIA GPU + CUDA cuDNN

Setup

  1. Download and extract the LJ Speech dataset

  2. Clone this repo: git clone https://github.com/NVIDIA/tacotron2.git

  3. CD into this repo: cd tacotron2

  4. Initialize submodule: git submodule init; git submodule update

  5. Update .wav paths: sed -i -- 's,DUMMY,ljs_dataset_folder/wavs,g' filelists/*.txt (a sketch for checking the rewritten paths follows this list)

    • Alternatively, set load_mel_from_disk=True in hparams.py and update mel-spectrogram paths

  6. Install [PyTorch 1.0]

  7. Install Python requirements or build the Docker image

    • Install Python requirements: pip install -r requirements.txt
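
Before training, it can help to confirm that the sed rewrite in step 5 produced paths that actually resolve. Below is a minimal sketch, assuming each filelist line has the form `wav_path|transcript`; the helper itself is not part of the repo.

```python
# Hypothetical helper (not part of the repo): sanity-check that the .wav paths
# in the filelists resolve after the sed rewrite in step 5. Assumes each line
# is formatted as "wav_path|transcript".
import glob
import os

def check_filelists(pattern="filelists/*.txt"):
    missing = []
    for filelist in glob.glob(pattern):
        with open(filelist, encoding="utf-8") as f:
            for line in f:
                wav_path = line.split("|")[0].strip()
                if wav_path and not os.path.isfile(wav_path):
                    missing.append((filelist, wav_path))
    if missing:
        for filelist, wav_path in missing[:10]:
            print(f"missing: {wav_path} (listed in {filelist})")
        print(f"{len(missing)} missing .wav files in total")
    else:
        print("all .wav paths resolve")

if __name__ == "__main__":
    check_filelists()
```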

Training

  1. python train.py --output_directory=outdir --log_directory=logdir

  2. (OPTIONAL) tensorboard --logdir=outdir/logdir
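
train.py periodically writes checkpoints into the output directory. Here is a minimal sketch for inspecting one, assuming the dict layout written by the repo's save_checkpoint (keys such as iteration, learning_rate, state_dict); the checkpoint filename is a placeholder.

```python
# Minimal sketch for inspecting a training checkpoint written to outdir.
# Assumes the dict layout used by train.py's save_checkpoint
# (keys such as 'iteration', 'learning_rate', 'state_dict').
import torch

checkpoint = torch.load("outdir/checkpoint_10000", map_location="cpu")  # hypothetical filename
print("keys:", list(checkpoint.keys()))
print("iteration:", checkpoint.get("iteration"))
print("learning rate:", checkpoint.get("learning_rate"))
# Number of parameter tensors in the model state dict
print("state_dict tensors:", len(checkpoint.get("state_dict", {})))
```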

Multi-GPU (distributed) and FP16 Training

  1. python -m multiproc train.py --output_directory=outdir --log_directory=logdir --hparams=distributed_run=True,fp16_run=True
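
Before launching a distributed FP16 run, it can be worth confirming how many GPUs are visible and what they support. A small environment check, not part of the repo:

```python
# Quick environment check before launching distributed / FP16 training.
# Confirms multiple GPUs are visible and reports their compute capability
# (FP16 benefits most from Volta-class GPUs or newer).
import torch

n_gpus = torch.cuda.device_count()
print(f"visible GPUs: {n_gpus}")
for i in range(n_gpus):
    name = torch.cuda.get_device_name(i)
    major, minor = torch.cuda.get_device_capability(i)
    print(f"  cuda:{i} {name} (compute capability {major}.{minor})")
if n_gpus < 2:
    print("fewer than 2 GPUs visible; distributed_run=True will not help here")
```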

Inference demo

  1. Download our published [Tacotron 2] model

  2. Download our published [WaveGlow] model

  3. jupyter notebook --ip=127.0.0.1 --port=31337

  4. Load inference.ipynb

N.b. When performing Mel-Spectrogram to Audio synthesis, make sure Tacotron 2 and the Mel decoder were trained on the same mel-spectrogram representation.
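
The notebook walks through text → mel spectrogram → audio. The sketch below condenses that flow; the module names (hparams, train, text) follow this repo's layout, and the checkpoint filenames are placeholders for the published models.

```python
# Condensed sketch of the inference.ipynb flow: text -> Tacotron 2 mel
# spectrogram -> WaveGlow audio. Module names follow this repo's layout;
# checkpoint paths are placeholders for the published models. The WaveGlow
# submodule (step 4 of Setup) must be importable so the checkpoint unpickles.
import numpy as np
import torch

from hparams import create_hparams
from train import load_model
from text import text_to_sequence

hparams = create_hparams()

# Load the published Tacotron 2 checkpoint
tacotron2 = load_model(hparams)
tacotron2.load_state_dict(torch.load("tacotron2_statedict.pt")["state_dict"])
tacotron2.cuda().eval()

# Load the published WaveGlow checkpoint (trained on the same mel representation)
waveglow = torch.load("waveglow_256channels.pt")["model"]
waveglow.cuda().eval()

# Encode text and run Tacotron 2 inference to get a mel spectrogram
text = "Waveforms are generated from predicted mel spectrograms."
sequence = np.array(text_to_sequence(text, ["english_cleaners"]))[None, :]
sequence = torch.from_numpy(sequence).cuda().long()

with torch.no_grad():
    _, mel_outputs_postnet, _, alignments = tacotron2.inference(sequence)
    # Vocode the predicted mel spectrogram with WaveGlow
    audio = waveglow.infer(mel_outputs_postnet, sigma=0.666)

print(audio.shape)
```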

Related repos

WaveGlow: Faster than real time Flow-based Generative Network for Speech Synthesis

nv-wavenet: Faster than real time WaveNet.

Acknowledgements

This implementation uses code from the following repos: Keith Ito, Prem Seetharaman, as described in our code.

We are inspired by Ryuichi Yamamoto's Tacotron PyTorch implementation.

We are thankful to the Tacotron 2 paper authors, especially Jonathan Shen, Yuxuan Wang, and Zongheng Yang.
