transformer-tensorflow
TensorFlow implementation of Attention Is All You Need. (2017. 6)
Python 3.6
TensorFlow 1.8
hb-config (Singleton Config)
nltk (tokenizer and BLEU score)
tqdm (progress bar)
Project structure initialized by hb-base
```
.
├── config                  # Config files (.yml, .json) used with hb-config
├── data                    # dataset path
├── notebooks               # Prototyping with numpy or tf.InteractiveSession
├── transformer             # transformer architecture graphs (from input to logits)
│   ├── __init__.py         # Graph logic
│   ├── attention.py        # Attention (multi-head, scaled_dot_product, etc.)
│   ├── encoder.py          # Encoder logic
│   ├── decoder.py          # Decoder logic
│   └── layer.py            # Layers (FFN)
├── data_loader.py          # raw_data -> processed_data -> generate_batch (using Dataset)
├── hook.py                 # training or test hook features (e.g. print_variables)
├── main.py                 # define experiment_fn
└── model.py                # define EstimatorSpec
```
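transformer/attention.py implements multi-head and scaled dot-product attention. As a rough orientation, here is a minimal sketch of scaled dot-product attention in TF 1.x; the function name and shapes are illustrative, not the repo's exact code:

```python
# Minimal sketch of scaled dot-product attention (TF 1.x).
# Names and shapes are illustrative, not the repo's exact code.
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    # q, k: [batch, heads, seq_len, key_dim]; v: [batch, heads, seq_len, value_dim]
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)  # QK^T / sqrt(d_k)
    weights = tf.nn.softmax(scores)                            # attention weights
    return tf.matmul(weights, v)                               # weighted sum of values
```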
Reference: hb-config, Dataset, experiment_fn, EstimatorSpec
Train and evaluate with 'WMT German-English (2016)' dataset
Config files let you control the whole experimental environment.

Example: check-tiny.yml
```yml
data:
  base_path: 'data/'
  raw_data_path: 'tiny_kor_eng'
  processed_path: 'tiny_processed_data'
  word_threshold: 1

  PAD_ID: 0
  UNK_ID: 1
  START_ID: 2
  EOS_ID: 3

model:
  batch_size: 4
  num_layers: 2
  model_dim: 32
  num_heads: 4
  linear_key_dim: 20
  linear_value_dim: 24
  ffn_dim: 30
  dropout: 0.2

train:
  learning_rate: 0.0001
  optimizer: 'Adam'  # one of 'Adagrad', 'Adam', 'Ftrl', 'Momentum', 'RMSProp', 'SGD'
  train_steps: 15000
  model_dir: 'logs/check_tiny'
  save_checkpoints_steps: 1000
  check_hook_n_iter: 100
  min_eval_frequency: 100
  print_verbose: True
  debug: False

slack:
  webhook_url: ""  # after training, notify you using slack-webhook
```
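These values are exposed through hb-config's singleton Config. A minimal sketch of how a config like the one above is loaded and read (the path argument is illustrative):

```python
# Load the YAML above through hb-config's singleton Config object.
# The exact path/name argument here is illustrative.
from hbconfig import Config

Config("config/check-tiny")        # load check-tiny.yml once, globally
print(Config.model.batch_size)     # -> 4
print(Config.train.learning_rate)  # -> 0.0001
```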
Debug mode: uses tfdbg (the TensorFlow debugger).
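When debugging, the standard TF 1.x pattern is to attach the tfdbg CLI to an Estimator via a session hook; a minimal sketch (not necessarily the repo's exact wiring):

```python
# Attach the tfdbg CLI to an Estimator via a session hook (TF 1.x pattern).
from tensorflow.python import debug as tf_debug

debug_hook = tf_debug.LocalCLIDebugHook()
# estimator.train(input_fn=train_input_fn, hooks=[debug_hook])
```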
check-tiny is a dataset of about 30 sentences translated from Korean into English. (Recommended: read it :) )
Install requirements.
```
pip install -r requirements.txt
```
Then, pre-process the raw data.
```
python data_loader.py --config check-tiny
```
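Under the hood, data_loader.py turns raw text into ID sequences and batches them with the tf.data Dataset API. A minimal sketch of that batching pattern (TF 1.x; all names here are illustrative, not the repo's exact code):

```python
# Illustrative tf.data batching pattern (TF 1.x); names are hypothetical.
import tensorflow as tf

def make_input_fn(enc_inputs, dec_inputs, targets, batch_size):
    def input_fn():
        dataset = tf.data.Dataset.from_tensor_slices(
            ((enc_inputs, dec_inputs), targets))
        dataset = dataset.shuffle(buffer_size=1000).repeat().batch(batch_size)
        return dataset.make_one_shot_iterator().get_next()
    return input_fn
```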
Finally, start training and evaluating the model.
```
python main.py --config check-tiny --mode train_and_evaluate
```
Or, you can use the IWSLT'15 English-Vietnamese dataset.
```
sh prepare-iwslt15.en-vi.sh                                       # download dataset
python data_loader.py --config iwslt15-en-vi                      # preprocessing
python main.py --config iwslt15-en-vi --mode train_and_evaluate   # start training
```
After training, you can test the model.
Command:

```
python predict.py --config {config} --src {src_sentence}
```
Example:

```
$ python predict.py --config check-tiny --src "안녕하세요. 반갑습니다."

------------------------------------
Source: 안녕하세요. 반갑습니다.
> Result: Hello . I'm glad to see you . <s> vectors . <s> Hello locations . <s> will . <s> . <s> you . <s>
```
✅ : Working
◽️ : Not tested yet.
- ✅ `evaluate`: Evaluate on the evaluation data.
- ◽️ `extend_train_hooks`: Extends the hooks for training.
- ◽️ `reset_export_strategies`: Resets the export strategies with the new_export_strategies.
- ◽️ `run_std_server`: Starts a TensorFlow server and joins the serving thread.
- ◽️ `test`: Tests training, evaluating and exporting the estimator for a single step.
- ✅ `train`: Fit the estimator using the training data.
- ✅ `train_and_evaluate`: Interleaves training and evaluation (see the sketch after this list).
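These modes are schedules for tf.contrib.learn's learn_runner/Experiment, the TF 1.x-era API this repo targets. A hedged sketch of how an experiment_fn ties the Estimator, input functions, and schedule together; the stub model_fn and input_fn below are trivial placeholders standing in for model.py and data_loader.py, so only the wiring shape mirrors the repo:

```python
# Hedged sketch of experiment_fn wiring (TF 1.x contrib API).
# model_fn and input_fn are stubs; only the wiring shape mirrors the repo.
import tensorflow as tf
from tensorflow.contrib.learn import Experiment, RunConfig
from tensorflow.contrib.learn.python.learn import learn_runner

def model_fn(features, labels, mode, params):
    # Toy model so the sketch is self-contained and runnable.
    w = tf.get_variable("w", [], initializer=tf.zeros_initializer())
    loss = tf.reduce_mean(tf.square(features - w))
    train_op = tf.train.AdamOptimizer(params.learning_rate).minimize(
        loss, global_step=tf.train.get_or_create_global_step())
    return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)

def input_fn():
    # Stands in for the tf.data pipeline built in data_loader.py.
    return tf.constant([1.0, 2.0, 3.0]), None

def experiment_fn(run_config, params):
    estimator = tf.estimator.Estimator(
        model_fn=model_fn, config=run_config, params=params)
    return Experiment(estimator=estimator,
                      train_input_fn=input_fn,
                      eval_input_fn=input_fn,
                      train_steps=params.train_steps)

# schedule is one of the modes listed above.
learn_runner.run(experiment_fn=experiment_fn,
                 run_config=RunConfig(model_dir="logs/sketch"),
                 schedule="train_and_evaluate",
                 hparams=tf.contrib.training.HParams(
                     learning_rate=1e-4, train_steps=100))
```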
```
tensorboard --logdir logs
```
check-tiny example
Paper - Attention Is All You Need (2017. 6) by A. Vaswani et al. (Google Brain team)
tensor2tensor - A library for generalized sequence to sequence models (official code)
Dongjun Lee (humanbrain.djlee@gmail.com)