
allennlp-distributed-training

This repo holds a few example AllenNLP experiments modified to run with DistributedDataParallel support. The training_config directory contains two versions of the same set of experiments. The configs in the distributed_data_parallel directory differ mainly in their dataset readers: these readers are replicas of the original AllenNLP readers, with a minor modification to support distributed sampling, as sketched below.
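
As a rough illustration of the kind of change involved (a hypothetical sketch, not the repo's actual reader code; the helper name shard_lines is made up here), a distributed-aware reader can partition its input so that each worker process only yields its own share of the examples, using the process rank and world size from torch.distributed:

import torch.distributed as dist

def shard_lines(file_path):
    """Yield only the lines of `file_path` that belong to the current worker."""
    # If torch.distributed has not been initialized (single-process training),
    # fall back to rank 0 in a world of size 1, i.e. yield every line.
    rank = dist.get_rank() if dist.is_initialized() else 0
    world_size = dist.get_world_size() if dist.is_initialized() else 1
    with open(file_path) as f:
        for idx, line in enumerate(f):
            # Round-robin split: worker `rank` keeps every `world_size`-th example,
            # so the processes read disjoint subsets of the dataset.
            if idx % world_size == rank:
                yield line

A reader's _read method can then iterate over shard_lines(file_path) instead of the raw file, so each DistributedDataParallel worker trains on a different slice of the data.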

To run the distributed experiments, install AllenNLP:

conda create -n allennlp_distributed python=3.7
conda activate allennlp_distributed
git clone https://github.com/allenai/allennlp
cd allennlp
pip install .

And run:

allennlp train training_config/distributed_data_parallel/esim.jsonnet --include-package distributed-training -s output/

To run without the distributed setup, do the usual AllenNLP installation and use the experiments in training_config/data_parallel/.
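
For example, a non-distributed run of the same ESIM experiment might look like this (assuming the config file keeps the same name in that directory; the --include-package flag should only be needed if the config references this repo's custom readers):

allennlp train training_config/data_parallel/esim.jsonnet -s output/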

Speed Comparison: Time taken to train one epoch (averaged over 3 epochs)

GPU - 2080 Ti

NOTE: The time reported does not correspond to the training_duration metric. This is the time taken by the Trainer._train_epoch method.

| Experiment | Single GPU | 2x Data Parallel | 2x Distributed | 4x Data Parallel | 4x Distributed |
|---|---|---|---|---|---|
| esim.jsonnet (400K SNLI samples) | 4m 15s | NA | NA | 4m 30s | 2m 13s |
| bidaf.jsonnet | 5m 44s | NA | NA | 4m 10s | 2m 5s |

