Online Symbolic Gradient-Based Optimization for Factored Action MDPs


2019-11-22


Abstract: This paper investigates online stochastic planning for problems with large factored state and action spaces. We introduce a novel algorithm that builds a symbolic representation capturing an approximation of the action-value function Q in terms of the action variables, and then performs gradient-based search to select an action for the current state. The algorithm can be seen as a symbolic extension of Monte-Carlo search, induced by independence assumptions on state and action variables, and augmented with gradients to speed up the search. This avoids both the space explosion typically faced by symbolic methods and the dearth of samples faced by Monte-Carlo methods when the action space is large. An experimental evaluation on benchmark problems shows that the algorithm is competitive with the state of the art across problem sizes and that it provides significant improvements for large factored action spaces.
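The idea of searching over action variables with gradients can be illustrated with a minimal sketch. It is not the paper's algorithm: the toy machine-reboot problem, the function names (`q_surrogate`, `numeric_gradient`, `select_action`), the 0.9 survival probability, and the 0.5 reboot cost are all assumptions made for illustration. Under an independence assumption each binary variable is replaced by its marginal probability, so Boolean OR becomes `1 - (1 - x)(1 - y)`, and the resulting algebraic expression is maximized by projected gradient ascent. Gradients are estimated here by finite differences purely to keep the sketch dependency-free.

```python
from typing import Callable, List

Vector = List[float]


def q_surrogate(state: Vector, action: Vector, cost: float = 0.5) -> float:
    """Hypothetical one-step Q approximation over marginal probabilities.

    state[i]  : probability that machine i is currently running
    action[i] : probability of rebooting machine i
    A running machine survives with probability 0.9 (assumed), and a
    reboot always brings the machine up. Variables are treated as
    independent, so OR is computed algebraically.
    """
    total = 0.0
    for s, a in zip(state, action):
        keeps_running = 0.9 * s
        # P(reboot OR keeps running) under independence
        next_up = 1.0 - (1.0 - a) * (1.0 - keeps_running)
        total += next_up - cost * a
    return total


def numeric_gradient(f: Callable[[Vector], float], x: Vector,
                     eps: float = 1e-5) -> Vector:
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def select_action(state: Vector, steps: int = 50, lr: float = 0.1) -> List[int]:
    """Projected gradient ascent on action marginals, then rounding."""
    marginals = [0.5] * len(state)  # start from an uninformed action
    for _ in range(steps):
        grad = numeric_gradient(lambda a: q_surrogate(state, a), marginals)
        # gradient step, clipped back into the valid range [0, 1]
        marginals = [min(1.0, max(0.0, m + lr * g))
                     for m, g in zip(marginals, grad)]
    # round each marginal to obtain a concrete binary action
    return [1 if m > 0.5 else 0 for m in marginals]


if __name__ == "__main__":
    print(select_action([1.0, 0.0, 1.0, 0.0]))
```

In this toy setting the search touches each action variable through its gradient rather than enumerating all 2^n joint actions, which is the intuition behind scaling to large factored action spaces.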



