RNN-Transducer

2020-01-17

End-to-End Speech Recognition using RNN-Transducer

File description

  • eval.py: decoding with the RNNT joint model

  • model.py: RNNT model, containing the acoustic and phoneme models

  • model2012.py: RNNT model following Graves2012

  • seq2seq/*: seq2seq with attention

  • rnnt_np.py: RNNT loss function implemented on MXNet, supporting both the symbol and gluon APIs; based on the PyTorch implementation

  • DataLoader.py: data processing

  • train.py: RNNT training script; the model can be initialized from CTC and PM models

  • train_ctc.py: CTC training script

  • train_att.py: attention training script
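
For readers unfamiliar with what a transducer loss computes, here is a minimal NumPy sketch of the forward recursion from Graves2012 for a single utterance. It is an illustration only, not the code in rnnt_np.py: the function name `rnnt_forward_nll`, the `(T, U + 1, V)` layout of `log_probs`, and the choice of index 0 for blank are assumptions made for this example.

```python
import numpy as np

def rnnt_forward_nll(log_probs, labels, blank=0):
    """Negative log-likelihood of `labels` summed over all alignments.

    log_probs: array (T, U + 1, V) of log-softmax joint-network outputs,
               for T frames, U target labels and V output symbols.
    labels:    sequence of U target label ids (no blanks).
    """
    T, U1, _ = log_probs.shape
    assert U1 == len(labels) + 1
    alpha = np.full((T, U1), -np.inf)   # log forward variables
    alpha[0, 0] = 0.0
    for t in range(T):
        for u in range(U1):
            if t == 0 and u == 0:
                continue
            candidates = []
            if t > 0:   # reached by emitting blank at node (t - 1, u)
                candidates.append(alpha[t - 1, u] + log_probs[t - 1, u, blank])
            if u > 0:   # reached by emitting labels[u - 1] at node (t, u - 1)
                candidates.append(alpha[t, u - 1] + log_probs[t, u - 1, labels[u - 1]])
            alpha[t, u] = np.logaddexp.reduce(candidates)
    # every complete alignment ends with a blank from the last node
    return -(alpha[T - 1, U1 - 1] + log_probs[T - 1, U1 - 1, blank])
```

The double loop is quadratic in utterance and label length and is meant for checking small cases by hand; a practical loss would also need the backward pass and batching.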

Directory description

  • conf: Kaldi feature extraction config

Reference Paper

Run

  • Compile RNNT loss: follow the instructions in here to compile MXNet with RNNT loss.

  • Extract features: link the Kaldi TIMIT example dirs (local, steps, utils), execute run.sh to extract 40-dim fbank features, then run feature_transform.sh to get 123-dim features as described in Graves2013.

  • Train RNNT model:

python train.py --lr 1e-3 --bi --dropout .5 --out exp/rnnt_bi_lr1e-3 --schedule
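
The 123-dim features from the feature-extraction step correspond, in Graves2013, to 40 filterbank coefficients plus energy (41 values) stacked with their first and second temporal derivatives. The sketch below shows that stacking in NumPy. It is not what feature_transform.sh does internally: the helper name `add_deltas` is hypothetical, and simple finite differences from `np.gradient` stand in here for the regression-based deltas Kaldi computes.

```python
import numpy as np

def add_deltas(static):
    """Stack static features with first and second temporal differences.

    static: array of shape (T, D), one row per frame.
    Returns an array of shape (T, 3 * D).
    """
    delta = np.gradient(static, axis=0)    # first-order difference over time
    delta2 = np.gradient(delta, axis=0)    # second-order difference over time
    return np.concatenate([static, delta, delta2], axis=1)

# 40 filterbank coefficients plus one energy term per frame (random stand-in data)
frames = np.random.randn(200, 41)
features = add_deltas(frames)
print(features.shape)  # (200, 123)
```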

Evaluation

By default, eval.py evaluates only the RNNT model.

  • Greedy decoding:

python eval.py <path to best model parameters> --bi

  • Beam search:

python eval.py <path to best model parameters> --bi --beam <beam size>
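
To make the greedy mode concrete, here is a minimal sketch of greedy transducer decoding for one utterance. It is an illustration under assumptions, not the logic of eval.py: `predict_step` and `joint` are hypothetical callables standing in for the prediction and joint networks, blank is assumed to be index 0, and the `max_symbols` cap per frame is added here only to guarantee termination.

```python
import numpy as np

def greedy_decode(encoder_out, predict_step, joint, blank=0, max_symbols=10):
    """Greedy transducer decoding for one utterance.

    encoder_out:  array (T, H) of acoustic encoder outputs.
    predict_step: callable (last_label, state) -> (pred_vector, new_state).
    joint:        callable (enc_vector, pred_vector) -> log-probs over V symbols.
    """
    hyp = []
    pred, state = predict_step(blank, None)    # start from the blank symbol
    for enc in encoder_out:
        for _ in range(max_symbols):           # cap emissions per frame
            k = int(np.argmax(joint(enc, pred)))
            if k == blank:
                break                          # blank: advance to the next frame
            hyp.append(k)
            pred, state = predict_step(k, state)  # condition on the new label
    return hyp
```

Beam search keeps several partial hypotheses per frame instead of only the argmax, which is why it costs more time per utterance.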

Results

  • CTC

    Decode      PER
    greedy      20.36
    beam 100    20.03

  • Transducer

    Decode      PER
    greedy      20.74
    beam 40     19.84
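
PER (phone error rate) is conventionally the total edit distance between reference and hypothesis phone sequences divided by the total reference length. The sketch below shows that computation. The function names are hypothetical, and it may differ from the scoring used to produce the numbers above (for example in how phones are mapped before scoring).

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]

def phone_error_rate(refs, hyps):
    """PER in percent over paired lists of phone sequences."""
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    total = sum(len(r) for r in refs)
    return 100.0 * errors / total

# one deleted phone out of four reference phones
print(phone_error_rate([["a", "b", "c", "d"]], [["a", "b", "d"]]))  # 25.0
```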

Requirements

  • Python 3.6

  • MXNet 1.1.0

  • numpy 1.14

TODO

  • Beam search acceleration

  • Seq2Seq with attention

