pytorch-trpo

2019-09-10

PyTorch implementation of TRPO

This repo contains a PyTorch implementation of a Trust Region Policy Optimization agent for an environment with a discrete action space.
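For a discrete action space, the "trust region" in TRPO is usually expressed as a bound on the mean KL divergence between the old and the updated categorical policy. The sketch below illustrates that check in plain Python; it is not taken from this repo. The function names and the `max_kl=0.01` threshold are assumptions chosen for illustration.

```python
import math

def categorical_kl(p_old, p_new, eps=1e-8):
    """KL(p_old || p_new) for two action-probability vectors over the same actions."""
    # eps guards against log(0); a zero-probability action in p_old contributes 0.
    return sum(po * (math.log(po + eps) - math.log(pn + eps))
               for po, pn in zip(p_old, p_new))

def within_trust_region(old_probs, new_probs, max_kl=0.01):
    """Average the per-state KL over a batch and compare it to a threshold.

    old_probs / new_probs: lists of per-state action distributions.
    max_kl: hypothetical trust-region size, not a value from this repo.
    """
    kls = [categorical_kl(po, pn) for po, pn in zip(old_probs, new_probs)]
    mean_kl = sum(kls) / len(kls)
    return mean_kl, mean_kl <= max_kl

# A small policy change stays inside the region, a large one does not.
old = [[0.5, 0.5], [0.2, 0.8]]
print(within_trust_region(old, [[0.52, 0.48], [0.21, 0.79]]))
print(within_trust_region(old, [[0.9, 0.1], [0.8, 0.2]]))
```

In a full TRPO update, a candidate step that violates this bound would typically be shrunk by a backtracking line search rather than accepted.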

Environment Setup

  1. Install conda for Python 2.7.

  2. Create and activate the conda environment, then install the pip requirements:

         conda create --name trpo --file requirements/conda_requirements.txt
         source activate trpo
         pip install -r requirements/pip_requirements.txt

  3. Install PyTorch from source at commit eff5b8b.

Usage

python run_trpo.py --env=GYM_ENV_ID

where GYM_ENV_ID is the environment ID of the gym environment you wish to train on.
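As a rough illustration of how a script could accept an `--env=...` flag like the one above, here is a minimal `argparse` sketch. It is an assumption about the general pattern, not the actual contents of `run_trpo.py`, and `Pong-v0` is only an example ID.

```python
import argparse

def parse_args(argv=None):
    """Parse a required --env flag naming the gym environment to train on."""
    parser = argparse.ArgumentParser(
        description="Train a TRPO agent (illustrative sketch)")
    parser.add_argument("--env", required=True,
                        help="ID of the gym environment, e.g. Pong-v0")
    return parser.parse_args(argv)

# Equivalent to invoking the script with: --env=Pong-v0
args = parse_args(["--env=Pong-v0"])
print(args.env)
```

The parsed ID would then typically be passed to `gym.make` to construct the environment.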

Results

[Animation: trpo_pong_gif]

A game of Pong played using the policy model learned by a TRPO agent

[Figure: trpo_pong_png]

Plot of total reward per episode of Pong vs. episode number

Related Repos

OpenAI Baselines implementation of parallel TRPO in TensorFlow

Ilya Kostrikov's implementation of TRPO for continuous control in PyTorch

