
transformer-tensorflow

2020-01-10


TensorFlow implementation of "Attention Is All You Need" (June 2017).

transformer-architecture.png

Requirements

Project Structure

Project initialized with hb-base.

.
├── config                  # Config files (.yml, .json) used with hb-config
├── data                    # Dataset path
├── notebooks               # Prototyping with numpy or tf.InteractiveSession
├── transformer             # Transformer architecture graphs (from input to logits)
│   ├── __init__.py             # Graph logic
│   ├── attention.py            # Attention (multi-head, scaled_dot_product, etc.)
│   ├── encoder.py              # Encoder logic
│   ├── decoder.py              # Decoder logic
│   └── layer.py                # Layers (FFN)
├── data_loader.py          # raw_data -> preprocessed_data -> generate_batch (using Dataset)
├── hook.py                 # Training or test hook features (e.g. print_variables)
├── main.py                 # Defines experiment_fn
└── model.py                # Defines EstimatorSpec

Reference : hb-config, Dataset, experiments_fn, EstimatorSpec
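
For orientation, the scaled dot-product attention that attention.py is described as covering can be sketched in plain Python. This is a dependency-free illustration of the formula softmax(Q K^T / sqrt(d_k)) V from the paper, not the repository's TensorFlow code; the function names and the nested-list representation are assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V on nested lists.

    queries: [n_q][d_k], keys: [n_k][d_k], values: [n_k][d_v]
    Returns (outputs [n_q][d_v], weights [n_q][n_k]).
    """
    d_k = len(keys[0])
    scale = math.sqrt(d_k)
    d_v = len(values[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the value vectors.
        outputs.append(
            [sum(wj * v[c] for wj, v in zip(w, values)) for c in range(d_v)]
        )
    return outputs, weights
```

When all keys are identical, the attention weights are uniform and each output row is simply the mean of the value vectors.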

Todo

  • Train and evaluate with 'WMT German-English (2016)' dataset

Config

The config file controls the entire experimental environment.

example: check-tiny.yml

data:
  base_path: 'data/'
  raw_data_path: 'tiny_kor_eng'
  processed_path: 'tiny_processed_data'
  word_threshold: 1

  PAD_ID: 0
  UNK_ID: 1
  START_ID: 2
  EOS_ID: 3

model:
  batch_size: 4
  num_layers: 2
  model_dim: 32
  num_heads: 4
  linear_key_dim: 20
  linear_value_dim: 24
  ffn_dim: 30
  dropout: 0.2

train:
  learning_rate: 0.0001
  optimizer: 'Adam'  # ('Adagrad', 'Adam', 'Ftrl', 'Momentum', 'RMSProp', 'SGD')

  train_steps: 15000
  model_dir: 'logs/check_tiny'

  save_checkpoints_steps: 1000
  check_hook_n_iter: 100
  min_eval_frequency: 100

  print_verbose: True
  debug: False

slack:
  webhook_url: ""  # after training notify you using slack-webhook

  • debug mode : using tfdbg

  • check-tiny is a dataset of about 30 sentences translated from Korean into English (worth reading :) ).
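
To make the model settings more concrete, here is a small sketch of how the attention dimensions in check-tiny relate to each other. It assumes that linear_key_dim and linear_value_dim are the total projection sizes, split evenly across num_heads; that reading is an assumption, and the helper per_head_dims is hypothetical rather than part of the repository.

```python
# Values copied from the model section of check-tiny.yml.
MODEL_CONFIG = {
    "model_dim": 32,
    "num_heads": 4,
    "linear_key_dim": 20,
    "linear_value_dim": 24,
}


def per_head_dims(model_cfg):
    """Return (key_dim_per_head, value_dim_per_head).

    Assumes the key/value projections are split evenly across heads,
    so both totals must be divisible by num_heads.
    """
    heads = model_cfg["num_heads"]
    for name in ("linear_key_dim", "linear_value_dim"):
        if model_cfg[name] % heads != 0:
            raise ValueError(
                f"{name}={model_cfg[name]} is not divisible by num_heads={heads}"
            )
    return (
        model_cfg["linear_key_dim"] // heads,
        model_cfg["linear_value_dim"] // heads,
    )


if __name__ == "__main__":
    print(per_head_dims(MODEL_CONFIG))
```

Under this assumption, each of the 4 heads in check-tiny works with 5-dimensional keys and 6-dimensional values.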

Usage

Install requirements.

pip install -r requirements.txt

Then, pre-process raw data.

python data_loader.py --config check-tiny
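
As a rough picture of what a pre-processing step of this kind involves, the sketch below builds a vocabulary and encodes a sentence using the special IDs from the config. It is not the code in data_loader.py: the whitespace tokenization, the token strings in SPECIAL_TOKENS, and the reading of word_threshold as a minimum word frequency are all assumptions made for illustration.

```python
from collections import Counter

# Special IDs as listed in the data section of check-tiny.yml.
PAD_ID, UNK_ID, START_ID, EOS_ID = 0, 1, 2, 3
# Hypothetical token strings for those IDs (assumption, not from the repo).
SPECIAL_TOKENS = ["<pad>", "<unk>", "<s>", "</s>"]


def build_vocab(sentences, word_threshold=1):
    """Map tokens to IDs, keeping words seen at least word_threshold times."""
    counts = Counter(tok for s in sentences for tok in s.split())
    vocab = {tok: idx for idx, tok in enumerate(SPECIAL_TOKENS)}
    # Most frequent words first; ties broken alphabetically for determinism.
    for tok, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if n >= word_threshold:
            vocab[tok] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Convert a sentence to IDs, mapping unknown words to UNK_ID."""
    ids = [vocab.get(tok, UNK_ID) for tok in sentence.split()]
    return ids + [EOS_ID]


if __name__ == "__main__":
    vocab = build_vocab(["hello world", "hello there"], word_threshold=2)
    print(encode("hello world", vocab))
```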

Finally, train and evaluate the model.

python main.py --config check-tiny --mode train_and_evaluate

Alternatively, you can use the IWSLT'15 English-Vietnamese dataset.

sh prepare-iwslt15.en-vi.sh                                        # download dataset
python data_loader.py --config iwslt15-en-vi                       # preprocessing
python main.py --config iwslt15-en-vi --mode train_and_evaluate    # start training

Predict

After training, you can test the model.

  • command

python predict.py --config {config} --src {src_sentence}

  • example

$ python predict.py --config check-tiny --src "안녕하세요. 반갑습니다."
------------------------------------
Source: 안녕하세요. 반갑습니다.
 > Result: Hello . I'm glad to see you . <s> vectors . <s> Hello locations . <s> will . <s> . <s> you . <s>

Experiment modes

✅ : Working
◽️ : Not tested yet.

  • ✅ evaluate : Evaluate on the evaluation data.

  • ◽️ extend_train_hooks :  Extends the hooks for training.

  • ◽️ reset_export_strategies : Resets the export strategies with the new_export_strategies.

  • ◽️ run_std_server : Starts a TensorFlow server and joins the serving thread.

  • ◽️ test : Tests training, evaluating and exporting the estimator for a single step.

  • ✅ train : Fit the estimator using the training data.

  • ✅ train_and_evaluate : Interleaves training and evaluation.
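
Since the mode is passed as a free-form string on the command line, one way to catch a misspelled mode early is to validate it when parsing arguments. The sketch below shows this with argparse. It is a hypothetical wrapper, not the argument handling in main.py; only the --config and --mode flag names and the mode names come from this README.

```python
import argparse

# Mode names as listed in this README.
MODES = [
    "evaluate",
    "extend_train_hooks",
    "reset_export_strategies",
    "run_std_server",
    "test",
    "train",
    "train_and_evaluate",
]


def parse_args(argv=None):
    """Parse --config and --mode, rejecting unknown modes up front."""
    parser = argparse.ArgumentParser(description="Run an experiment mode.")
    parser.add_argument("--config", required=True, help="config name, e.g. check-tiny")
    parser.add_argument("--mode", required=True, choices=MODES, help="experiment mode")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(f"config={args.config} mode={args.mode}")
```

With choices set, argparse exits with a usage error for a mode that is not in the list instead of failing later during the run.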


TensorBoard

tensorboard --logdir logs

  • check-tiny example

check_tiny_tensorboard.png

Reference

Author

Dongjun Lee (humanbrain.djlee@gmail.com)

