DQN-tensorflow

2020-01-09

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.


This implementation contains:

  1. Deep Q-network and Q-learning

  2. Experience replay memory

    • to reduce the correlations between consecutive updates

  3. A target network for Q-learning that is held fixed between periodic updates

    • to reduce the correlations between target and predicted Q-values
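The three components above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the repository's code: the names `ReplayMemory`, `q_learning_target`, and `sync_target` are hypothetical, network weights are stood in for by a plain list, and the discount factor default is only a placeholder.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sampling breaks the correlation between
        # consecutive transitions within a training batch.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """Return r + gamma * max_a Q_target(s', a), or just r at episode end.

    next_q_values is assumed to come from the fixed target network,
    not from the network currently being trained.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def sync_target(step, online_weights, target_weights, sync_every):
    """Return a copy of the online weights every sync_every steps.

    On all other steps the target weights are returned unchanged.
    """
    if step % sync_every == 0:
        return list(online_weights)
    return target_weights
```

In a training loop, transitions would be added to the memory at every step, a batch sampled once the memory holds enough of them, and `sync_target` called with the current step count so that the targets change only at fixed intervals.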

Requirements

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Result of training for 24 hours on a GTX 980 Ti.

best.gif

Simple Results

Details of Breakout with model m2 (red) trained for 30 hours on a GTX 980 Ti.


Details of Breakout with model m3 (red) trained for 30 hours on a GTX 980 Ti.


Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay


[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay


[1] & [2]
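For readers unfamiliar with the action-repeat (frame-skip) setting compared in [1] and [2], here is a rough sketch of what it means: the agent picks an action once and the environment applies it for several consecutive frames, with the rewards summed. The `repeat_action` helper and the `CountingEnv` toy environment are hypothetical and only assume the classic gym `step(action)` return shape of `(observation, reward, done, info)`; they are not taken from this repository.

```python
class CountingEnv:
    """Hypothetical toy environment for illustration only.

    Every step yields a reward of 1.0, and the episode ends after
    `length` steps. The observation is simply the step counter.
    """

    def __init__(self, length):
        self.length = length
        self.t = 0

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


def repeat_action(env, action, repeat):
    """Apply `action` for up to `repeat` steps and sum the rewards."""
    total_reward = 0.0
    obs, done, info = None, False, {}
    for _ in range(repeat):
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            break  # stop early if the episode ends mid-repeat
    return obs, total_reward, done, info
```

With a repeat of 1 the agent chooses an action on every frame; with a repeat of 4 it chooses once every four frames, which reduces the number of network evaluations per episode at the cost of coarser control.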

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping are not implemented as described in the paper.
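As background for the variants compared in [3], the sketch below shows how their standard textbook formulations differ. It is an illustration under stated assumptions rather than this repository's implementation: the function names are hypothetical, Q-values are plain Python lists, and the discount factor default is a placeholder.

```python
def dqn_target(reward, target_next_q, done, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(target_next_q)


def double_dqn_target(reward, online_next_q, target_next_q, done, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    # Index of the action the online network rates highest in the next state.
    best_action = max(range(len(online_next_q)), key=online_next_q.__getitem__)
    return reward + gamma * target_next_q[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a')."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

Dueling DDQN combines the two ideas: the Q-values fed into `double_dqn_target` would themselves be produced by a dueling head.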


[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay


[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay


License

MIT License.

