Advantage Amplification in Slowly Evolving Latent-State Environments

2019-10-09
Abstract: Latent-state environments with long horizons, such as those faced by recommender systems, pose significant challenges for reinforcement learning (RL). We identify and analyze several key hurdles for RL in such environments, including belief-state error and small action advantage. We develop a general principle called advantage amplification that can overcome these hurdles through the use of temporal abstraction. We propose several aggregation methods and prove they induce amplification in certain settings. We also bound the loss in optimality incurred by our methods in environments where the latent state evolves slowly, and demonstrate their performance empirically in a stylized user-modeling task.
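The core intuition behind advantage amplification is that when the per-step advantage of the best action is small relative to belief-state estimation error, noisy Q-value estimates frequently misorder actions; committing to an action for k consecutive steps (one form of temporal abstraction) grows the aggregated advantage roughly like a k-step discounted sum while the noise on a single decision stays fixed. The sketch below is a minimal toy illustration of that effect, not the paper's method: the Gaussian noise model and all numeric values (delta, sigma, gamma) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
gamma = 0.99        # discount factor (assumed)
delta = 0.02        # true per-step advantage of the better action (assumed)
sigma = 0.1         # std. dev. of Q-estimate error from belief-state error (assumed)
trials = 100_000

def misselection_rate(k: int) -> float:
    """Probability that noisy Q-estimates misorder two actions when each
    decision commits to an action for k consecutive steps (action repetition).
    The aggregated advantage grows like a k-step discounted sum, while the
    per-decision estimation noise stays fixed."""
    amplified = delta * (1 - gamma**k) / (1 - gamma)   # ~ k * delta for modest k
    q_good = amplified + sigma * rng.standard_normal(trials)
    q_bad = sigma * rng.standard_normal(trials)
    return float(np.mean(q_good < q_bad))

for k in (1, 5, 20, 50):
    print(f"k={k:3d}  P(wrong action) ~ {misselection_rate(k):.3f}")
```

Under these toy settings the misselection probability drops sharply as k grows, which is the amplification effect: holding actions fixed over longer spans makes the signal (cumulative advantage) dominate the noise (belief-state error), at the cost of some optimality loss when the latent state drifts, which the paper bounds for slowly evolving latent states.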
