DSing ASR task: Resources and a Baseline for Unaccompanied Singing ASR.
In this repository, you will find the scripts used to construct the DSing ASR-oriented dataset and the baseline system constructed on Kaldi.
Cite:
@inproceedings{Roa_Dabike-Barker_2019,
author = {Roa Dabike, Gerardo and Barker, Jon},
title = {{Automatic Lyric Transcription from Karaoke Vocal Tracks: Resources and a Baseline System}},
year = 2019,
booktitle = {Proceedings of the 20th Annual Conference of the International Speech Communication Association (INTERSPEECH 2019)}
}
1- DSing dataset
DSing is an ASR-oriented dataset constructed from the Smule Sing!300x30x2 dataset (Sing!). This repository provides the scripts to transform Sing! to the DSing ASR task.
2- Initial steps
Before running any of the scripts, you need to obtain access to the Sing! dataset. For more details, go to the DAMP repository.
3- Transform Sing! to DSing dataset
The scripts that transform the Sing! dataset into the DSing ASR task dataset are located in the [DSing Construction](DSing Construction/) directory. The process is based on a series of Python tools that are tied together in the runme_sing2dsing.sh bash script.
Define the variable version with the name of the DSing version you want to construct (DSing1, DSing3 or DSing30). Any other option will raise an error.
Set the variable DSing_dest with the path where the DSing version will be saved.
Set the variable SmuleSing_path with the path to your copy of Smule Sing!300x30x2.
Run code until step K
.....
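The three variables above could be set and checked along these lines. This is a minimal sketch, not the contents of runme_sing2dsing.sh: the paths are placeholders and the check_version helper is hypothetical. Only the variable names and the three accepted version values come from the steps above.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; the real runme_sing2dsing.sh may differ.

# Variable names as described in the steps above; the values are placeholders.
version=DSing1
DSing_dest=/path/to/DSing
SmuleSing_path=/path/to/sing_300x30x2

# Hypothetical helper: accept only the three documented DSing versions.
check_version() {
  case "$1" in
    DSing1|DSing3|DSing30) return 0 ;;
    *) echo "Unknown DSing version: $1" >&2; return 1 ;;
  esac
}

check_version "$version" || exit 1
echo "Building $version from $SmuleSing_path into $DSing_dest"
```

Checking the version name before any processing starts means a typo fails immediately rather than partway through a long run.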
4- Extract DSing dataset using pre-segmented data
If you want to analyse the segmentation results, or to use DSing for a purpose other than ASR, the [DSing preconstructed](DSing preconstructed) directory contains a small script that recovers the transcriptions and utterance wav files. You only need to set the output directory and the path to your copy of the Sing! dataset.