Abstract
We study average- and total-cost Markov decision problems with large state spaces. Since the computational and statistical cost of finding the optimal policy scales with the size of the state space, we focus on searching for near-optimality within a low-dimensional family of policies. In particular, we show that for problems with a Kullback-Leibler divergence cost function, policy optimization can be recast as a convex optimization and solved approximately using a stochastic subgradient algorithm. The complexity of this method scales with the size of the policy family rather than the state space. We show that the performance of the resulting policy is close to the best achievable within the low-dimensional family. We demonstrate the efficacy of our approach by optimizing a policy for budget allocation in crowd labeling, an important crowdsourcing application.