
sklearn-deeprl

2020-02-13

Deep reinforcement learning. In scikit-learn. In less than 50 effective lines.

Dive-in button: Binder

Currently, both demos use the vanilla cross-entropy (CE) method with a policy approximated by a neural network. For RL, it boils down to repeating:

  • Generate N games

  • Take M best

  • Fit to those M best samples
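The three steps above can be sketched without any dependencies. This is a minimal illustration, not the code from the demos: the corridor environment, the tabular policy, and the names `play_game`, `select_elites`, `fit_policy` and `cross_entropy_method` are all assumptions made for the example. The agent starts at the left end of a five-cell corridor and is rewarded for reaching the right end quickly.

```python
import random

# Hypothetical toy task: a corridor of 5 cells, start at cell 0, goal at cell 4.
N_STATES, N_ACTIONS, MAX_STEPS = 5, 2, 20  # action 0 = left, action 1 = right


def play_game(policy, rng):
    """Play one game; return (states, actions, total_reward)."""
    state, states, actions, total = 0, [], [], 0.0
    for _ in range(MAX_STEPS):
        action = rng.choices(range(N_ACTIONS), weights=policy[state])[0]
        states.append(state)
        actions.append(action)
        state = max(0, state - 1) if action == 0 else state + 1
        total -= 0.1  # small cost per step, so shorter games score higher
        if state == N_STATES - 1:
            total += 1.0  # reward for reaching the goal
            break
    return states, actions, total


def select_elites(games, m_best):
    """Take the M games with the highest total reward."""
    return sorted(games, key=lambda g: g[2], reverse=True)[:m_best]


def fit_policy(elites, smoothing=0.1):
    """Refit the policy as smoothed action frequencies over the elite samples."""
    counts = [[smoothing] * N_ACTIONS for _ in range(N_STATES)]
    for states, actions, _ in elites:
        for s, a in zip(states, actions):
            counts[s][a] += 1
    return [[c / sum(row) for c in row] for row in counts]


def cross_entropy_method(n_games=100, m_best=20, n_iterations=20, seed=0):
    rng = random.Random(seed)
    # Start from a uniform random policy.
    policy = [[1.0 / N_ACTIONS] * N_ACTIONS for _ in range(N_STATES)]
    for _ in range(n_iterations):
        games = [play_game(policy, rng) for _ in range(n_games)]  # generate N games
        elites = select_elites(games, m_best)                     # take M best
        policy = fit_policy(elites)                               # fit to those M best
    return policy


policy = cross_entropy_method()
```

After a few iterations the policy in cells 0 to 3 should strongly prefer moving right. Replacing the frequency table in `fit_policy` with a classifier trained on the elite state-action pairs turns this into the neural-network version.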

The CE method is a very general approach to approximate estimation and maximization tasks; you can read about it here. For reinforcement learning, we use the optimization version, which basically fits the agent to generate games where the reward is high. More on that here.

While this approach falls flat in some cases, and it takes black magic to make it work with infinite MDPs or long session lengths, it still works unreasonably well in most cases. One more awesome trait is that it extends effortlessly to policy approximation (e.g. deep RL), partially observable MDPs, and all kinds of weird stuff you see in the wild.
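To make the policy-approximation point concrete, here is a rough sketch of how the "fit to the M best" step can become ordinary supervised learning in scikit-learn. The elite data below is synthetic, and the network size, number of passes, and the helper `sample_action` are assumptions for illustration; the demos may be structured differently.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

n_actions = 2
rng = np.random.default_rng(0)

# Hypothetical elite samples: 2-D observations and the action taken in each.
elite_states = rng.normal(size=(200, 2))
elite_actions = (elite_states[:, 0] > 0).astype(int)

# The policy is a classifier that maps an observation to action probabilities.
agent = MLPClassifier(hidden_layer_sizes=(20, 20), activation="tanh", random_state=0)

# Each partial_fit call makes a single pass over the given data, so the network
# is updated incrementally rather than retrained from scratch.
for _ in range(50):
    agent.partial_fit(elite_states, elite_actions, classes=np.arange(n_actions))


def sample_action(agent, state, rng):
    """Sample an action from the policy's predicted probabilities for one state."""
    probs = agent.predict_proba(state.reshape(1, -1))[0]
    return int(rng.choice(n_actions, p=probs))


action = sample_action(agent, np.array([1.5, -0.3]), rng)
```

Sampling from `predict_proba` instead of taking the most likely action keeps the generated games varied, which the elite-selection step relies on.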

If you want something heavier, take a look at agentnet.

