kaldi-tf

2020-04-07

A set of scripts for getting data out of Kaldi and into TensorFlow.

Pipeline

| Step | Code Location |
| --- | --- |
| 1) Generate Kaldi phoneme-level alignments (*.ali) via GMMs | Kaldi source |
| 2) Generate Kaldi nnet3 neural net example files (egs.*.ark) from alignments | Kaldi source |
| 3) Convert binary Nnet3Egs ark files to text ark files via nnet3-copy-egs.cc | Kaldi source |
| 4) Convert text ark file to csv via egs-to-csv.py | this repo |
| 5) Convert csv to tfrecords via csv-to-tfrecords.py | this repo |
| 6) Read tfrecords, train, and evaluate with train_and_eval.py | this repo |

Modifying Kaldi egs

This is unrelated to TensorFlow, but if you want to open Kaldi egs, change them, and train on the modified egs, follow these steps:

  1. Convert the binary egs ark file to text: $ nnet3-copy-egs ark:egs.1.ark ark,t:egs.1.ark.txt

  2. Make your changes to the new text ark file.

  3. Convert the text ark file back to binary and write a new scp file alongside it: $ nnet3-copy-egs ark,t:egs.1.ark.txt ark,scp:egs.1.ark,egs.scp

  4. Update the file paths in the new scp file. They depend on where you ran nnet3-copy-egs, so they may not point to where the archive lives during training.
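The path fix in step 4 can be scripted. The following is a minimal sketch, not part of this repo: it assumes each scp line has the usual Kaldi form `key path:offset`, and the function name `rewrite_scp_paths` and the directory `/data/exp/egs` are hypothetical names chosen for illustration.

```python
import posixpath


def rewrite_scp_paths(scp_lines, new_dir):
    """Point every archive path at new_dir, keeping key, file name, and offset.

    Assumes lines look like "key path:offset" (the ":offset" part is optional).
    """
    fixed = []
    for line in scp_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # The key is the first whitespace-separated token; the rest is the location.
        key, location = line.split(None, 1)
        location = location.strip()
        # Split off a trailing ":<byte offset>" if one is present.
        path, sep, offset = location.rpartition(":")
        if not sep or not offset.isdigit():
            path, sep, offset = location, "", ""
        new_path = posixpath.join(new_dir, posixpath.basename(path))
        fixed.append(f"{key} {new_path}{sep}{offset}")
    return fixed


# Hypothetical example entries; real keys and offsets come from nnet3-copy-egs.
example = ["utt1-0 egs.1.ark:11", "utt1-1 some/dir/egs.1.ark:2048"]
for entry in rewrite_scp_paths(example, "/data/exp/egs"):
    print(entry)
```

To apply it to a real file, read the lines of egs.scp, pass them through the function, and write the result back out. Only the directory part of each path changes, so the byte offsets into the archive stay valid.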

