Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces

2019-11-05

Abstract: Policy evaluation with linear function approximation is an important problem in reinforcement learning. In high-dimensional feature spaces, the problem becomes extremely hard in terms of both computational efficiency and approximation quality. We propose a new algorithm, LSTD(λ)-RP, which combines random projection with eligibility traces to address these two challenges. We carry out a theoretical analysis of LSTD(λ)-RP and provide meaningful upper bounds on the estimation error, the approximation error, and the total generalization error. These results demonstrate that LSTD(λ)-RP benefits from both random projection and eligibility traces, and that it can achieve better performance than the prior LSTD-RP and LSTD(λ) algorithms.
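The abstract does not spell out the algorithm, so the following is only a minimal sketch of how random projection and eligibility traces can be combined in an LSTD-style estimator; it is not the authors' implementation. It assumes a Gaussian projection matrix with N(0, 1/d) entries, the standard LSTD(λ) accumulation of A and b on the projected features, and a tiny ridge term added purely for numerical safety. All function names (`make_projection`, `lstd_lambda_rp`, `value`, and so on) are hypothetical.

```python
import random

def make_projection(d, D, rng):
    """Assumed choice: d x D Gaussian matrix with N(0, 1/d) entries."""
    s = (1.0 / d) ** 0.5
    return [[rng.gauss(0.0, s) for _ in range(D)] for _ in range(d)]

def project(P, phi):
    """Map a D-dimensional feature vector to d dimensions."""
    return [sum(p * f for p, f in zip(row, phi)) for row in P]

def solve(A, b, ridge=1e-8):
    """Solve (A + ridge*I) x = b by Gaussian elimination with partial pivoting.
    The ridge term is a numerical convenience, not part of the source."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for i in range(n):
        M[i][i] += ridge
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def lstd_lambda_rp(features, rewards, gamma, lam, d, seed=0):
    """features: n+1 feature vectors for states x_0..x_n of one trajectory.
    rewards: n rewards, rewards[t] observed on the step x_t -> x_{t+1}.
    Returns the projection matrix and the d-dimensional weight vector."""
    rng = random.Random(seed)
    D = len(features[0])
    P = make_projection(d, D, rng)
    psi = [project(P, phi) for phi in features]
    A = [[0.0] * d for _ in range(d)]
    b = [0.0] * d
    z = [0.0] * d  # eligibility trace in the projected space
    for t, r in enumerate(rewards):
        # z_t = gamma * lambda * z_{t-1} + psi(x_t)
        z = [gamma * lam * zi + pi for zi, pi in zip(z, psi[t])]
        # temporal difference of features: psi(x_t) - gamma * psi(x_{t+1})
        diff = [u - gamma * v for u, v in zip(psi[t], psi[t + 1])]
        for i in range(d):
            b[i] += z[i] * r
            for j in range(d):
                A[i][j] += z[i] * diff[j]
    theta = solve(A, b)
    return P, theta

def value(P, theta, phi):
    """Estimated value of a state with high-dimensional features phi."""
    return sum(t * p for t, p in zip(theta, project(P, phi)))
```

In this sketch the linear system is only d x d rather than D x D, which is where the computational saving from projection would come from; setting `lam=0` reduces it to a plain LSTD estimator on the projected features.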
