资源论文Scalar Posterior Sampling with Applications

Scalar Posterior Sampling with Applications

2020-02-17 | |  61 |   38 |   0

Abstract 

We propose a practical non-episodic PSRL algorithm that unlike recent state-of-theart PSRL algorithms uses a deterministic, model-independent episode switching schedule. Our algorithm termed deterministic schedule PSRL (DS-PSRL) is efficient in terms of time, sample, and space complexity. We prove a Bayesian regret bound under mild assumptions. Our result is more generally applicable to multiple parameters and continuous state action problems. We compare our algorithm with state-of-the-art PSRL algorithms on standard discrete and continuous problems from the literature. Finally, we show how the assumptions of our algorithm satisfy a sensible parametrization for a large class of problems in sequential recommendations.

上一篇:Deep Anomaly Detection Using Geometric Transformations

下一篇:DAGs with NO TEARS: Continuous Optimization for Structure Learning

用户评价
全部评价

热门资源

  • The Variational S...

    Unlike traditional images which do not offer in...

  • Stratified Strate...

    In this paper we introduce Stratified Strategy ...

  • Learning to learn...

    The move from hand-designed features to learned...

  • A Mathematical Mo...

    Direct democracy, where each voter casts one vo...

  • Learning to Predi...

    Much of model-based reinforcement learning invo...