Monte-Carlo Expectation Maximization for Decentralized POMDPs
Feng Wu †, Shlomo Zilberstein ‡, Nicholas R. Jennings †
Abstract
We address two significant drawbacks of state-of-the-art solvers of decentralized POMDPs (DEC-POMDPs): the reliance on complete knowledge of the model and limited scalability as the complexity of the domain grows. We extend a recently proposed approach for solving DEC-POMDPs via a reduction to the maximum likelihood problem, which in turn can be solved using EM. We introduce a model-free version of this approach that employs Monte-Carlo EM (MCEM). While a naïve implementation of MCEM is inadequate in multiagent settings, we introduce several improvements in sampling that produce high-quality results on a variety of DEC-POMDP benchmarks, including large problems with thousands of agents.