The Expected-Length Model of Options

2019-09-30
Abstract: Effective options can make reinforcement learning easier by enhancing an agent's ability to both explore in a targeted manner and plan further into the future. However, learning an appropriate model of an option's dynamics is hard, requiring the estimation of a highly parameterized probability distribution. This paper introduces and motivates the Expected-Length Model (ELM) for options, an alternate model for transition dynamics. We prove ELM is a (biased) estimator of the traditional Multi-Time Model (MTM), but provide a non-vacuous bound on their deviation. We further prove that, in stochastic shortest path problems, ELM induces a value function that is sufficiently similar to the one induced by MTM, and is thus capable of supporting near-optimal behavior. We explore the practical utility of this option model experimentally, finding consistent support for the thesis that ELM is a suitable replacement for MTM. In some cases, we find ELM leads to more sample-efficient learning, especially when options are arranged in a hierarchy.
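
To make the contrast between the two models concrete, here is a minimal sketch. It is not the paper's implementation; it assumes the usual reading that a multi-time model discounts each possible option duration separately, while an expected-length model discounts once by the expected duration. The geometric duration, the independence of duration and end state, and all function and parameter names are assumptions made for illustration.

```python
# Toy comparison of a multi-time style and an expected-length style option model.
# Assumptions (not from the abstract): the option terminates with a fixed
# probability at every step (geometric duration), and the state it ends in is
# independent of how long it ran.

def multi_time_model(gamma, term_prob, p_end, horizon=500):
    """Discount every possible duration k separately, weighted by Pr(duration = k)."""
    discount = 0.0
    for k in range(1, horizon + 1):
        pr_k = (1 - term_prob) ** (k - 1) * term_prob  # geometric duration
        discount += gamma ** k * pr_k
    return {state: discount * p for state, p in p_end.items()}


def expected_length_model(gamma, term_prob, p_end):
    """Discount once, using only the expected duration of the option."""
    mu = 1.0 / term_prob  # mean of the geometric duration
    return {state: gamma ** mu * p for state, p in p_end.items()}


if __name__ == "__main__":
    p_end = {"goal": 1.0}  # hypothetical option that always ends in "goal"
    print(multi_time_model(0.9, 0.5, p_end))
    print(expected_length_model(0.9, 0.5, p_end))
```

In this toy setting the expected-length value is never larger than the multi-time value (a consequence of Jensen's inequality), which is one way to see why the simpler model is biased while needing only an expected duration and an end-state distribution.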
