Efficient Methods for Multi-Objective Decision-Theoretic Planning

2019-11-21

In decision-theoretic planning problems, such as (partially observable) Markov decision problems [Wiering and Van Otterlo, 2012] or coordination graphs [Guestrin et al., 2002], agents typically aim to optimize a scalar value function. However, in many real-world problems agents face multiple, possibly conflicting objectives, e.g., maximizing the economic benefits of timber harvesting while minimizing ecological damage in a forest management scenario [Bone and Dragicevic, 2009]. In such multi-objective problems, the value is a vector rather than a scalar [Roijers et al., 2013a].

Even when there are multiple objectives, specialized multi-objective methods are not always necessary. When the problem can be scalarized, i.e., converted to a single-objective problem before planning, existing single-objective methods may apply. Unfortunately, such a priori scalarization is not possible when the scalarization weights, i.e., the parameters of the scalarization, are not known in advance. For example, consider a company that mines different metals whose market prices vary. If there is not enough time to re-solve the decision problem after each price change, we need specialized multi-objective methods that compute a coverage set, i.e., a set containing an optimal solution for every possible scalarization.

What constitutes a coverage set depends on the type of scalarization. Much existing research assumes the Pareto coverage set (PCS), or Pareto front, as the optimal solution set. However, we argue that this is not always the best choice. In the highly prevalent case in which the objectives will be linearly weighted, the convex coverage set (CCS) suffices. Because CCSs are typically much smaller and have exploitable mathematical properties, they are often much cheaper to compute than PCSs. Furthermore, when policies can be stochastic, all optimal value vectors can be attained by mixing policies from the CCS [Vamplew et al., 2009]. Therefore, this project focuses on finding planning methods that compute the CCS.
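
To make the difference between the PCS and the CCS concrete, the following is a minimal sketch for two objectives under maximization. It is not the planning method of this project: it simply enumerates a small, made-up list of value vectors (`VALUES`, hypothetical integer payoffs such as revenue from two metals) and filters them. The function names `pareto_coverage_set`, `weight_interval`, `convex_coverage_set` and `mix` are illustrative assumptions, and the CCS returned here is not necessarily minimal, since it keeps vectors that are optimal at only a single weight.

```python
from fractions import Fraction

# Hypothetical value vectors (objective 1, objective 2); both are maximized.
VALUES = [(10, 0), (0, 10), (4, 4), (6, 6), (7, 2), (3, 3)]


def dominates(u, v):
    """True if u Pareto-dominates v: at least as good everywhere, better somewhere."""
    return all(a >= b for a, b in zip(u, v)) and any(a > b for a, b in zip(u, v))


def pareto_coverage_set(vectors):
    """Keep every vector that no other vector Pareto-dominates."""
    return [v for v in vectors if not any(dominates(u, v) for u in vectors)]


def weight_interval(v, vectors):
    """Interval of w in [0, 1] where v maximizes w*v[0] + (1-w)*v[1], or None.

    Assumes integer components so that Fraction arithmetic stays exact.
    """
    lo, hi = Fraction(0), Fraction(1)
    for u in vectors:
        # v beats u at weight w  <=>  w * a >= b
        a = (v[0] - u[0]) - (v[1] - u[1])
        b = u[1] - v[1]
        if a > 0:
            lo = max(lo, Fraction(b, a))
        elif a < 0:
            hi = min(hi, Fraction(b, a))
        elif b > 0:
            return None  # u is better than v for every weight
    return (lo, hi) if lo <= hi else None


def convex_coverage_set(vectors):
    """Keep every vector that is optimal for at least one linear weighting."""
    return [v for v in vectors if weight_interval(v, vectors) is not None]


def mix(u, v, p):
    """Expected value vector of playing u with probability p and v otherwise."""
    return tuple(p * a + (1 - p) * b for a, b in zip(u, v))


if __name__ == "__main__":
    print("PCS:", pareto_coverage_set(VALUES))
    print("CCS:", convex_coverage_set(VALUES))
    print("mix:", mix((10, 0), (6, 6), Fraction(1, 4)))
```

In this toy example the vector (7, 2) is Pareto-optimal but is not the best choice under any linear weighting, so it belongs to the PCS and not to the CCS. Mixing the CCS vectors (10, 0) and (6, 6) with probabilities 1/4 and 3/4 yields an expected value of (7, 4.5), which is at least as good as (7, 2) in both objectives, illustrating the remark about stochastic policies.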
