
NeuralGPU

Code for the Neural GPU model described in http://arxiv.org/abs/1511.08228. The extended version was described in https://arxiv.org/abs/1610.08613.

Requirements:

  • TensorFlow (see tensorflow.org for how to install)

The model can be trained on the following algorithmic tasks:

  • sort - Sort a symbol list

  • kvsort - Sort symbol keys in dictionary

  • id - Return the same symbol list

  • rev - Reverse a symbol list

  • rev2 - Reverse a symbol dictionary by key

  • incr - Add one to a symbol value

  • add - Long decimal addition

  • left - First symbol in list

  • right - Last symbol in list

  • left-shift - Left shift a symbol list

  • right-shift - Right shift a symbol list

  • bmul - Long binary multiplication

  • mul - Long decimal multiplication

  • dup - Duplicate a symbol list with padding

  • badd - Long binary addition

  • qadd - Long quaternary addition

  • search - Search for symbol key in dictionary

It can also be trained on the WMT English-French translation task:

  • wmt - WMT English-French translation (data will be downloaded)

The value range for symbols is defined by the --vocab_size flag. In particular, values go up to vocab_size - 1. So if you set --vocab_size=16 (the default), then --problem=rev will reverse lists of 15 symbols, and --problem=id will be the identity on lists of up to 15 symbols.
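As an illustration of this symbol convention, the sketch below shows what a training pair for --problem=rev with --vocab_size=16 might look like. This is a hypothetical example, not the repository's actual data generator; it assumes symbol 0 is reserved for padding, so the usable symbols are 1 through vocab_size - 1.

# Illustrative sketch only; not the repository's data generator.
# Assumption: symbol 0 is reserved for padding, so usable symbols are 1 .. vocab_size - 1.
import random

vocab_size = 16                      # the --vocab_size default
length = 8                           # one possible training length
inp = [random.randint(1, vocab_size - 1) for _ in range(length)]
target = list(reversed(inp))         # "rev": the target is the reversed input

print("input: ", inp)                # e.g. [3, 15, 7, 1, 9, 12, 4, 8]
print("target:", target)             # e.g. [8, 4, 12, 9, 1, 7, 15, 3]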

To train the model on the binary multiplication task, run:

python neural_gpu_trainer.py --problem=bmul

This trains the Extended Neural GPU; to train the original model, run:

python neural_gpu_trainer.py --problem=bmul --beam_size=0

While training, interim / checkpoint model parameters will be written to /tmp/neural_gpu/.

Once the error is low enough for your purposes, hit Ctrl-C to stop the training process. The latest model parameters will be saved in /tmp/neural_gpu/neural_gpu.ckpt-<step> and used on any subsequent run.
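To see which checkpoint a later run would pick up, you can query the checkpoint directory directly. A minimal sketch, assuming training used the default /tmp/neural_gpu/ directory and the standard tf.train checkpoint API:

import tensorflow as tf

# Assumption: training wrote its checkpoints to the default /tmp/neural_gpu/ directory.
latest = tf.train.latest_checkpoint("/tmp/neural_gpu/")
print(latest)  # e.g. /tmp/neural_gpu/neural_gpu.ckpt-<step>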

To evaluate how well a trained model decodes, run:

python neural_gpu_trainer.py --problem=bmul --mode=1

To interact with a model (experimental, see the code), run:

python neural_gpu_trainer.py --problem=bmul --mode=2

To train on WMT data, set a larger --nmaps and --vocab_size and turn off the curriculum:

python neural_gpu_trainer.py --problem=wmt --vocab_size=32768 --nmaps=256
  --vec_size=256 --curriculum_seq=1.0 --max_length=60 --data_dir ~/wmt

If you have less memory, try a lower batch size, e.g. --batch_size=4. With more GPUs in your system, there will be a batch on every GPU, so you can run larger models. For example, --batch_size=4 --num_gpus=4 --nmaps=512 --vec_size=512 runs a large (512-size) model on 4 GPUs with an effective batch of 4*4=16.
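For example, combining the flags above (an assumed combination; adjust paths and sizes to your setup), a 4-GPU run of the large WMT model could be launched as:

python neural_gpu_trainer.py --problem=wmt --vocab_size=32768 --nmaps=512
  --vec_size=512 --batch_size=4 --num_gpus=4 --curriculum_seq=1.0
  --max_length=60 --data_dir ~/wmt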

Maintained by Lukasz Kaiser (lukaszkaiser)

Link: https://github.com/tensorflow/models/tree/master/research/neural_gpu
