DSing ASR task: resources and a baseline system for unaccompanied singing ASR.

This repository contains the scripts used to construct the DSing ASR-oriented dataset and the baseline system built on Kaldi.

Cite:

@inproceedings{Roa_Dabike-Barker_2019,  
  author = {Roa Dabike, Gerardo and Barker, Jon},  
  title = {{Automatic Lyric Transcription from Karaoke Vocal Tracks: Resources and a Baseline System}},  
  year = 2019,  
  booktitle = {Proceedings of the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)}  
}

1- DSing dataset

DSing is an ASR-oriented dataset constructed from the Smule Sing!300x30x2 dataset (Sing!). This repository provides the scripts to transform Sing! to the DSing ASR task.

2- Initial steps

Before running any of the scripts, you need to obtain access to the Sing! dataset. For more details, see the DAMP repository.

3- Transform Sing! to DSing dataset

The scripts that transform the Sing! dataset into the DSing ASR task dataset are located in the [DSing Construction](DSing Construction/) directory. The process is based on a series of Python tools that are tied together in the runme_sing2dsing.sh bash script.

Define the variable version with the name of the DSing version you want to construct (DSing1, DSing3 or DSing30). Any other option will raise an error.
Set the variable DSing_dest with the path where the DSing version will be saved.
Set the variable SmuleSing_path with the path to your copy of Smule Sing!300x30x2.
Run code until step K
.....
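
The variable settings described above might look like the following minimal sketch. The paths are placeholders, and check_version is a hypothetical helper written for illustration only; it is not part of the repository, and the actual runme_sing2dsing.sh may validate the version differently.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: all values below are placeholders,
# not the defaults shipped in runme_sing2dsing.sh.

version="DSing1"                          # one of DSing1, DSing3 or DSing30
DSing_dest="/path/to/output/DSing"        # where the DSing version will be saved
SmuleSing_path="/path/to/sing_300x30x2"   # your copy of Smule Sing!300x30x2

# Hypothetical helper (an assumption, not from the repository):
# accept only the three version names listed above.
check_version() {
    case "$1" in
        DSing1|DSing3|DSing30) return 0 ;;
        *) echo "Unknown DSing version: $1" >&2; return 1 ;;
    esac
}

if check_version "$version"; then
    echo "Constructing $version from $SmuleSing_path into $DSing_dest"
fi
```

Checking the version name before any processing starts means a typo fails immediately instead of partway through the construction.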

4- Extract DSing dataset using pre-segmented data

If you want to analyse the segmentation results or use DSing for a purpose other than ASR, the [DSing preconstructed](DSing preconstructed) directory contains a small script that recovers the transcriptions and utterance wav files. You only need to set the output directory and the path to your copy of Sing!
