Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

2019-11-05
Abstract: Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard with respect to both computational efficiency and approximation quality. We propose a new algorithm, LSTD(λ)-RP, which leverages random projection techniques and incorporates eligibility traces to tackle these two challenges. We carry out a theoretical analysis of LSTD(λ)-RP and provide meaningful upper bounds on its estimation error, approximation error, and total generalization error. These results demonstrate that LSTD(λ)-RP benefits from both the random projection and eligibility trace strategies, and can achieve better performance than the prior LSTD-RP and LSTD(λ) algorithms.
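The abstract combines two standard ingredients: projecting high-dimensional features into a low-dimensional space with a random matrix, and running LSTD(λ) with eligibility traces on the projected features. A minimal sketch of that combination is below; it assumes a Gaussian projection matrix and a single sample trajectory, and is an illustrative implementation, not the paper's exact estimator or error-bound setting.

```python
import numpy as np

rng = np.random.default_rng(0)

D, d = 1000, 20          # original and projected feature dimensions (illustrative)
gamma, lam = 0.95, 0.5   # discount factor and eligibility-trace parameter

# Random projection: entries ~ N(0, 1/d), mapping R^D -> R^d
P = rng.normal(0.0, 1.0 / np.sqrt(d), size=(d, D))

def lstd_lambda_rp(phis, rewards):
    """LSTD(lambda) on randomly projected features (sketch).

    phis:    (T+1, D) array of high-dimensional features along a trajectory.
    rewards: (T,) array of rewards.
    Returns the weight vector theta in the projected d-dimensional space.
    """
    psis = phis @ P.T            # project all features once: (T+1, d)
    A = np.zeros((d, d))
    b = np.zeros(d)
    z = np.zeros(d)              # eligibility trace vector
    for t in range(len(rewards)):
        z = gamma * lam * z + psis[t]
        A += np.outer(z, psis[t] - gamma * psis[t + 1])
        b += z * rewards[t]
    # Small ridge term for invertibility in this finite-sample sketch
    return np.linalg.solve(A + 1e-6 * np.eye(d), b)
```

The projected value estimate at a state with feature vector `phi` is then `(P @ phi) @ theta`; all linear algebra runs in the d-dimensional space, which is the computational gain random projection buys.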
