# Video Captioning
## Dependencies

Check out the coco-caption and cider projects into your working directory.
## Data

Obtain the dataset you need:
## Getting started

### Generate metadata

```bash
run func_standalize_format
run func_preprocess_datainfo
run func_build_vocab
run func_create_sequencelabel
run func_convert_datainfo2cocofmt

# Pre-compute document frequency for CIDEr computation
run func_compute_ciderdf

# Pre-compute evaluation scores (BLEU_4, CIDEr, METEOR, ROUGE_L) for each caption
run func_compute_evalscores

# Extract video features
run func_extract_video_features
```
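The steps above can be chained into a single fail-fast script. The `func_*` names come from this README; the stub bodies below are placeholders so the sketch is self-contained and runnable — in the real repo each step performs actual preprocessing work.

```shell
#!/usr/bin/env bash
# Fail fast: abort the pipeline on the first step that errors.
set -euo pipefail

# Stub implementations (placeholders standing in for the real preprocessing steps).
func_standalize_format()        { echo "standalize_format: done"; }
func_preprocess_datainfo()      { echo "preprocess_datainfo: done"; }
func_build_vocab()              { echo "build_vocab: done"; }
func_create_sequencelabel()     { echo "create_sequencelabel: done"; }
func_convert_datainfo2cocofmt() { echo "convert_datainfo2cocofmt: done"; }
func_compute_ciderdf()          { echo "compute_ciderdf: done"; }
func_compute_evalscores()       { echo "compute_evalscores: done"; }
func_extract_video_features()   { echo "extract_video_features: done"; }

# Run the steps in exactly the order the README lists them.
for step in func_standalize_format func_preprocess_datainfo func_build_vocab \
            func_create_sequencelabel func_convert_datainfo2cocofmt \
            func_compute_ciderdf func_compute_evalscores \
            func_extract_video_features; do
  "$step"
done
```

Because the later steps (CIDEr document frequency, cached evaluation scores) read files produced by the earlier ones, preserving this order matters; the `set -e` guard stops the run before a missing intermediate file can cause a confusing downstream failure.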
## Training

Please refer to the `opts.py` file for the set of available train/test options.

```bash
# Train XE model
./train.sh 0 [GPUIDs]

# Train CST_GT_None/WXE model
./train.sh 1 [GPUIDs]

# Train CST_MS_Greedy model (using greedy baseline)
./train.sh 2 [GPUIDs]

# Train CST_MS_SCB model (using SCB baseline, where SCB is computed from GT captions)
./train.sh 3 [GPUIDs]

# Train CST_MS_SCB(*) model (using SCB baseline, where SCB is computed from model-sampled captions)
./train.sh 4 [GPUIDs]
```
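The numeric mode ids above are easy to mix up. A small hypothetical wrapper (not part of the repo) that maps the model names from the list to the mode id passed to `train.sh`:

```shell
# Hypothetical helper: map a model name from the README to its train.sh mode id.
# The name -> id mapping mirrors the list above; the wrapper itself is an assumption.
train_mode() {
  case "$1" in
    XE)                 echo 0 ;;
    CST_GT_None|WXE)    echo 1 ;;
    CST_MS_Greedy)      echo 2 ;;
    CST_MS_SCB)         echo 3 ;;
    CST_MS_SCB_sampled) echo 4 ;;  # "CST_MS_SCB(*)" in the list above
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Example (not executed here): ./train.sh "$(train_mode WXE)" [GPUIDs]
```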
## Testing

```bash
./test.sh 0 [GPUIDs]
```
## Acknowledgements