
Causal Bandits with Propagating Inference

2020-03-16

Abstract

Bandit is a framework for designing sequential experiments, where a learner selects an arm $A \in \mathcal{A}$ and obtains an observation corresponding to $A$ in each experiment. Theoretically, the tight regret lower-bound for the general bandit is polynomial with respect to the number of arms $|\mathcal{A}|$, and thus, to overcome this bound, the bandit problem with side-information is often considered. Recently, a bandit framework over a causal graph was introduced, where the structure of the causal graph is available as side-information and the arms are identified with interventions on the causal graph. Existing algorithms for causal bandit overcame the $\Omega(\sqrt{|\mathcal{A}|/T})$ simple-regret lower-bound; however, these algorithms work only when the interventions $\mathcal{A}$ are localized around a single node (i.e., an intervention propagates only to its neighbors). We then propose a novel causal bandit algorithm for an arbitrary set of interventions, whose effects can propagate throughout the causal graph. We also show that it achieves $O(\sqrt{\gamma^* \log(|\mathcal{A}|T)/T})$ simple regret, where $\gamma^*$ is determined by the causal graph structure. In particular, if the maximum in-degree of the causal graph is a constant, then $\gamma^* = O(N^2)$, where $N$ is the number of nodes.
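To make the simple-regret setting concrete, the following is a minimal sketch of the bandit protocol the abstract describes: a learner pulls arms for a budget of $T$ rounds and is judged by the gap between the best arm and the arm it finally recommends. This is only an illustration with a naive uniform-exploration learner, not the paper's algorithm; the arm reward probabilities and budget below are hypothetical.

```python
# Illustrative sketch of the simple-regret bandit protocol (NOT the
# paper's causal-bandit algorithm). Arm means and T are hypothetical.
import random

def simple_regret_uniform(arm_means, T, rng):
    """Pull each arm about T/|A| times (round-robin), recommend the arm
    with the best empirical mean, and return its simple regret
    (Bernoulli rewards with the given success probabilities)."""
    n = len(arm_means)
    totals = [0.0] * n
    counts = [0] * n
    for t in range(T):
        a = t % n  # uniform exploration over the arm set A
        reward = 1.0 if rng.random() < arm_means[a] else 0.0
        totals[a] += reward
        counts[a] += 1
    # Recommend the empirically best arm after the budget is exhausted.
    best_emp = max(range(n), key=lambda a: totals[a] / counts[a])
    # Simple regret: gap between the truly best arm and the recommended one.
    return max(arm_means) - arm_means[best_emp]

rng = random.Random(0)
means = [0.3, 0.5, 0.7, 0.4]  # hypothetical reward probabilities
regret = simple_regret_uniform(means, T=4000, rng=rng)
print(regret)
```

Uniform exploration of this kind needs a number of pulls polynomial in $|\mathcal{A}|$ to drive the simple regret down, which is exactly the dependence the causal side-information in the paper is meant to avoid.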

