Real-Time Solving of Quanti?ed CSPs Based on Monte-Carlo Game Tree Search

资源分类

2019-11-12 |

36 |

37 |

Abstract We develop a real-time algorithm based on a Monte-Carlo game tree search for solving a quanti?ed constraint satisfaction problem (QCSP), which is a CSP where some variables are universally quanti?ed. A universally quanti?ed variable represents a choice of nature or an adversary. The goal of a QCSP is to make a robust plan against an adversary. However, obtaining a complete plan off-line is intractable when the size of the problem becomes large. Thus, we need to develop a realtime algorithm that sequentially selects a promising value at each deadline. Such a problem has been considered in the ?eld of game tree search. In a standard game tree search algorithm, developing a good static evaluation function is crucial. However, developing a good static evaluation function for a QCSP is very dif?cult since it must estimate the possibility that a partially assigned QCSP is solvable. Thus, we apply a Monte-Carlo game tree search technique called UCT. However, the simple application of the UCT algorithm does not work since the player and the adversary are asymmetric, i.e., ?nding a game sequence where the player wins is very rare. We overcome this dif?culty by introducing constraint propagation techniques. We experimentally compare the winning probability of our UCT-based algorithm and the state-of-the-art alpha-beta search algorithm. Our results show that our algorithm outperforms the state-of-the-art algorithm in large-scale problems.

上一篇：Nested Rollout Policy Adaptation for Monte Carlo Tree Search

下一篇：The Increasing Cost Tree Search for Optimal Multi-Agent Path?nding Guni Sharon Roni Stern Meir Goldenberg Ariel Felner

用户评价

全部评价

还没有评论，说两句吧！

热门资源

Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
The Variational S...

Unlike traditional images which do not offer in...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...
Rating-Boosted La...

The performance of a recommendation system reli...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com