DQN-tensorflow

2020-01-09

Human-Level Control through Deep Reinforcement Learning

TensorFlow implementation of Human-Level Control through Deep Reinforcement Learning.

This implementation contains:

Deep Q-network and Q-learning
Experience replay memory, to reduce the correlations between consecutive updates
A network for Q-learning targets that is held fixed for intervals, to reduce the correlations between target and predicted Q-values
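The two stabilizing ideas listed above can be sketched in plain Python. This is a minimal illustration and not the code in this repository: the names ReplayMemory, q_learning_target, and TargetSync are hypothetical, weights are modelled as a plain dict, and gamma=0.99 is an assumed discount factor.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling mixes old and new transitions, so a minibatch
        # is not a run of consecutive, highly correlated frames.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """One-step target: r if terminal, else r + gamma * max_a Q_target(s', a)."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


class TargetSync:
    """Keeps a frozen copy of the online weights, refreshed every `interval` steps."""

    def __init__(self, interval):
        self.interval = interval
        self.target_weights = None

    def maybe_sync(self, step, online_weights):
        if self.target_weights is None or step % self.interval == 0:
            # Store a snapshot, not a reference, so later updates to the
            # online weights do not leak into the targets.
            self.target_weights = dict(online_weights)
        return self.target_weights
```

In a training loop, next_q_values would come from the network using the frozen target weights, while gradient updates are applied only to the online weights.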

Requirements

Python 2.7 or Python 3.3+
gym
tqdm
SciPy or OpenCV2
TensorFlow 0.12.0

Usage

First, install prerequisites with:

$ pip install tqdm gym[all]

To train a model for Breakout:

$ python main.py --env_name=Breakout-v0 --is_train=True
$ python main.py --env_name=Breakout-v0 --is_train=True --display=True

To test and record the screen with gym:

$ python main.py --is_train=False
$ python main.py --is_train=False --display=True

Results

Results of training for 24 hours on a GTX 980 Ti.

Simple Results

Details of Breakout with model m2 (red), trained for 30 hours on a GTX 980 Ti.

Details of Breakout with model m3 (red), trained for 30 hours on a GTX 980 Ti.

Detailed Results

[1] Action-repeat (frame-skip) of 1, 2, and 4 without learning rate decay

[2] Action-repeat (frame-skip) of 1, 2, and 4 with learning rate decay

[1] & [2]
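For readers unfamiliar with the terms in [1] and [2], the sketch below shows what action-repeat (frame-skip) and learning rate decay usually mean. It is an assumption-laden illustration, not this repository's code: step_fn is a hypothetical callable returning (observation, reward, done), and the exponential schedule with a floor is one common choice whose parameters are invented for the example.

```python
def repeat_action(step_fn, action, repeat):
    """Apply `action` for up to `repeat` consecutive frames and sum the rewards.

    `step_fn(action)` is a hypothetical callable returning
    (observation, reward, done) for a single frame.
    """
    total_reward = 0.0
    observation, done = None, False
    for _ in range(repeat):
        observation, reward, done = step_fn(action)
        total_reward += reward
        if done:  # stop early if the episode ends in the middle of a repeat
            break
    return observation, total_reward, done


def decayed_learning_rate(initial_lr, step, decay_steps, decay_rate, min_lr):
    """Exponential decay with a floor (an assumed schedule, for illustration)."""
    return max(min_lr, initial_lr * decay_rate ** (step / decay_steps))
```

With repeat=1 the agent chooses an action on every frame; with repeat=4 it chooses once per four frames, which makes each decision cheaper but coarser.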

[3] Action-repeat of 4 for DQN (dark blue), Dueling DQN (dark green), DDQN (brown), and Dueling DDQN (turquoise)

The current hyperparameters and gradient clipping are not implemented as they are in the paper.
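The variants compared in [3] differ mainly in how the learning target is computed and how Q-values are assembled. The sketch below follows the standard published formulations of Double DQN and the dueling architecture; it is not taken from this repository, the function names are hypothetical, and Q-values are plain Python lists rather than tensors.

```python
def dqn_target(reward, target_q_next, done, gamma=0.99):
    """DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(target_q_next)


def double_dqn_target(reward, online_q_next, target_q_next, done, gamma=0.99):
    """Double DQN: the online network selects, the target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(online_q_next)), key=lambda a: online_q_next[a])
    return reward + gamma * target_q_next[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head: Q(s, a) = V(s) + A(s, a) - mean over actions of A(s, a)."""
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]
```

Dueling DDQN combines the two: the dueling head produces the Q-values, and the Double DQN rule turns them into a target.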

[4] Distributed action-repeat (frame-skip) of 1 without learning rate decay

[5] Distributed action-repeat (frame-skip) of 4 without learning rate decay

License

MIT License.
