sklearn-deeprl

2020-02-13

Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.

Currently both demos use the vanilla cross-entropy (CE) method for a policy approximated by a neural network. For RL, it boils down to repeating:

Generate N games
Take M best
Fit to those M best samples
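
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's demo code: the corridor task, the helper names (`one_hot`, `play_game`, `train`) and all hyperparameters are assumptions made up for the example. It uses scikit-learn's `MLPClassifier` with `partial_fit` as the policy network, so that a batch of best games containing only one action does not break the fit.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Hypothetical toy task (not from the repository): walk right along a corridor.
N_CELLS, N_ACTIONS, MAX_STEPS = 5, 2, 20


def one_hot(pos):
    """Encode a corridor position as a feature vector for the network."""
    x = np.zeros(N_CELLS)
    x[pos] = 1.0
    return x


def play_game(agent, rng):
    """Play one game; return visited states, chosen actions and total reward."""
    pos, states, actions, total = 0, [], [], 0.0
    for _ in range(MAX_STEPS):
        probs = agent.predict_proba([one_hot(pos)])[0]
        action = rng.choice(N_ACTIONS, p=probs / probs.sum())
        states.append(one_hot(pos))
        actions.append(action)
        pos = max(0, pos - 1) if action == 0 else pos + 1
        total -= 1.0  # every step costs 1, so shorter games score higher
        if pos == N_CELLS - 1:
            break
    return states, actions, total


def train(n_iters=10, n_games=40, m_best=10, seed=0):
    rng = np.random.default_rng(seed)
    agent = MLPClassifier(hidden_layer_sizes=(16,), learning_rate_init=0.01,
                          random_state=seed)
    # The first call only tells the classifier which actions exist.
    agent.partial_fit([one_hot(0)] * N_ACTIONS, list(range(N_ACTIONS)),
                      classes=list(range(N_ACTIONS)))
    history = []
    for _ in range(n_iters):
        # Generate N games with the current policy.
        games = [play_game(agent, rng) for _ in range(n_games)]
        # Take the M best by total reward.
        games.sort(key=lambda g: g[2], reverse=True)
        elite = games[:m_best]
        # Fit the policy to the (state, action) pairs of those M games.
        X = [s for states, _, _ in elite for s in states]
        y = [a for _, actions, _ in elite for a in actions]
        agent.partial_fit(X, y)
        history.append(float(np.mean([g[2] for g in games])))
    return agent, history


if __name__ == "__main__":
    agent, history = train()
    print("mean reward per iteration:", history)
```

Each `partial_fit` call performs a single pass over the selected samples, so the policy moves only a little toward the best games on every iteration. How many iterations a real task needs depends on the environment and is not something this sketch tries to answer.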

CE is a very general approach to approximate estimation and maximization tasks; you can read about it here. For reinforcement learning, we use the optimization version, which basically tries to fit the agent so that it generates games where the reward is high. More on that here.
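
To make the optimization version concrete outside of RL, here is a small sketch that maximizes a function of one real variable. The Gaussian sampling distribution, the function name `cross_entropy_maximize` and the default parameters are illustrative assumptions, not something taken from the repository. The structure mirrors the RL loop: sample candidates, keep the best ones, refit the sampling distribution to them.

```python
import random
import statistics


def cross_entropy_maximize(score, mu=0.0, sigma=5.0, n_samples=100,
                           n_elite=10, n_iters=30, seed=0):
    """Maximize `score` over one real variable using a Gaussian sampler."""
    rng = random.Random(seed)
    for _ in range(n_iters):
        # Sample candidates from the current distribution.
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Keep the candidates with the highest score.
        elite = sorted(samples, key=score, reverse=True)[:n_elite]
        # Refit the distribution to the kept candidates.
        mu = statistics.fmean(elite)
        sigma = statistics.pstdev(elite) + 1e-6  # floor keeps sampling valid
    return mu


if __name__ == "__main__":
    # Hypothetical objective with its maximum at x = 3.
    best = cross_entropy_maximize(lambda x: -(x - 3.0) ** 2)
    print(best)
```

In the RL setting the "candidates" are whole games, the score is the total reward, and refitting the distribution means fitting the policy network to the state-action pairs of the best games.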

While this approach falls flat in some cases, and it takes black magic to make it work with infinite MDPs or long session lengths, it still works unreasonably well in most cases. One more awesome trait is that it extends effortlessly to policy approximation (e.g. deep RL), partially observable MDPs and all kinds of weird stuff you see in the wild.

If you want something heavier, take a look at agentnet.
