Causal Bandits: Learning Good Interventions via Causal Inference

2020-02-05

Abstract 

We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-armed bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than algorithms that do not use the additional causal information.
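The extra feedback the abstract refers to can be illustrated with the parallel-bandit setting: intervening on one variable still reveals the naturally sampled values of all the others, so a single pull also provides valid reward samples for every arm whose variable happened to take the corresponding value. The sketch below is illustrative only, with an assumed causal model and reward (all names and parameters are hypothetical, not taken from the paper):

```python
import random

# Hedged sketch of causal bandit feedback in a "parallel" model (assumed
# setup, not the paper's algorithm): N independent binary causes X_0..X_4,
# each X_j ~ Bernoulli(q[j]) unless intervened on. An arm is a pair (i, x)
# meaning do(X_i = x). We assume the reward is simply Y = X_0.
random.seed(0)
N = 5
q = [0.1, 0.3, 0.5, 0.7, 0.9]  # P(X_j = 1) without intervention (assumed)

def pull(i, x):
    """Perform do(X_i = x); observe the full variable vector and reward."""
    X = [1 if random.random() < q[j] else 0 for j in range(N)]
    X[i] = x                    # the intervention overrides X_i only
    Y = X[0]                    # assumed reward function: Y = X_0
    return X, Y

# Reward sums and counts per arm (j, v). Each pull updates EVERY arm
# consistent with the observation -- this is the causal side information.
sums = {(j, v): 0.0 for j in range(N) for v in (0, 1)}
cnts = {(j, v): 0 for j in range(N) for v in (0, 1)}

for t in range(2000):
    i, x = t % N, (t // N) % 2  # plain round-robin exploration
    X, Y = pull(i, x)
    for j in range(N):
        if j == i:
            sums[(i, x)] += Y
            cnts[(i, x)] += 1
        else:
            # X_j took value X[j] naturally, so (X, Y) is also a valid
            # sample from do(X_j = X[j]) in this independent-causes model.
            sums[(j, X[j])] += Y
            cnts[(j, X[j])] += 1

best = max(sums, key=lambda a: sums[a] / max(cnts[a], 1))
print(best)  # do(X_0 = 1) forces Y = 1, so it is identified as best
```

Because every pull contributes samples to roughly half of the arms rather than just one, the empirical means concentrate much faster than under standard bandit feedback; the paper's regret bound formalizes this advantage in terms of how balanced the interventional distributions are.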

