# Seq2Seq in PyTorch
This is a complete suite for training sequence-to-sequence models in PyTorch. It consists of several models, along with code for training them and running inference.
Using this code you can train:

* Neural-machine-translation (NMT) models
* Language models
* Image-to-caption generation
* Skip-thought sentence representations
* And more...
## Models
Models currently available:

* Simple Seq2Seq recurrent model
* Recurrent Seq2Seq with an attentional decoder
* Google neural machine translation (GNMT) recurrent model
* Transformer - the attention-only model from "Attention Is All You Need"
* ByteNet - convolution-based encoder + decoder
## Datasets
Datasets currently available:

* WMT16
* OpenSubtitles 2016
* COCO image captions
All datasets can be tokenized using one of 3 available segmentation methods. After a tokenization method is chosen, a vocabulary is generated and saved for future inference.
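As an illustration of the vocabulary-generation step, here is a minimal word-level sketch: count tokens in a corpus, reserve indices for special tokens, and map unseen words to `<unk>`. The function and special-token names are hypothetical, not this repository's API:

```python
from collections import Counter

# Hypothetical special tokens; the repository may use different names.
PAD, UNK, BOS, EOS = "<pad>", "<unk>", "<s>", "</s>"

def build_vocab(sentences, max_size=None, min_count=1):
    """Count word-level tokens and map the most frequent ones to indices."""
    counts = Counter(tok for sent in sentences for tok in sent.split())
    # Reserve the first indices for special tokens used during training/decoding.
    vocab = {tok: i for i, tok in enumerate([PAD, UNK, BOS, EOS])}
    for tok, c in counts.most_common(max_size):
        if c >= min_count and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def encode(sentence, vocab):
    """Map a sentence to indices, falling back to <unk> for unseen tokens."""
    return [vocab.get(tok, vocab[UNK]) for tok in sentence.split()]

corpus = ["the cat sat", "the dog sat"]
vocab = build_vocab(corpus)
ids = encode("the bird sat", vocab)  # "bird" is out-of-vocabulary
```

Saving `vocab` to disk is what makes later inference reproducible: the same token-to-index mapping must be used at decode time.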
## Training methods
The models can be trained using several methods:

* Basic Seq2Seq - given an encoded sequence, generate (decode) the output sequence. Training is done with teacher forcing.
* Multi Seq2Seq - several tasks (such as multiple languages) are trained simultaneously by using each data sequence both as input to the encoder and as output of the decoder.
* Image2Seq - used to train image-to-caption generators.
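Teacher forcing, used in the basic Seq2Seq method above, means the decoder is fed the ground-truth target tokens shifted by one position rather than its own predictions. A minimal sketch with a generic PyTorch GRU encoder-decoder (toy sizes; not the repository's actual model classes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
vocab_size, emb_dim, hidden = 16, 8, 8  # toy sizes for illustration

embed = nn.Embedding(vocab_size, emb_dim)
encoder = nn.GRU(emb_dim, hidden, batch_first=True)
decoder = nn.GRU(emb_dim, hidden, batch_first=True)
project = nn.Linear(hidden, vocab_size)

src = torch.randint(0, vocab_size, (2, 5))  # (batch, src_len)
tgt = torch.randint(0, vocab_size, (2, 6))  # (batch, tgt_len), tgt[:, 0] is <s>

# Encode the source; its final hidden state initializes the decoder.
_, h = encoder(embed(src))

# Teacher forcing: decoder inputs are the gold tokens tgt[:, :-1],
# and the training targets are the same sequence shifted left, tgt[:, 1:].
dec_out, _ = decoder(embed(tgt[:, :-1]), h)
logits = project(dec_out)  # (batch, tgt_len - 1, vocab)
loss = F.cross_entropy(logits.reshape(-1, vocab_size), tgt[:, 1:].reshape(-1))
loss.backward()
```

At inference time there is no gold target, so the decoder instead consumes its own previous prediction at each step (e.g. greedy or beam-search decoding).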
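The Multi Seq2Seq method above amounts to forming training pairs in every direction from parallel data, so each sequence serves as encoder input in some pairs and decoder output in others. A minimal sketch of that pairing (the data layout here is illustrative, not the repository's format):

```python
from itertools import permutations

# Toy parallel corpus: each entry holds the same sentence in several languages.
parallel = [
    {"en": "hello", "fr": "bonjour", "de": "hallo"},
    {"en": "thanks", "fr": "merci", "de": "danke"},
]

def make_pairs(corpus):
    """Yield (src_lang, src, tgt_lang, tgt) for every ordered language pair,
    so every sequence appears both as an encoder input and a decoder target."""
    for entry in corpus:
        for a, b in permutations(sorted(entry), 2):
            yield a, entry[a], b, entry[b]

pairs = list(make_pairs(parallel))  # 6 ordered pairs per entry for 3 languages
```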
## Usage
Example training scripts are available in the `scripts` folder, and inference examples in the `examples` folder.