Split Q Learning: Reinforcement Learning with Two-Stream Rewards

资源分类

2019-10-10 |

75 |

40 |

Abstract Drawing an inspiration from behavioral studies of human decision making, we propose here a general parametric framework for a reinforcement learning problem, which extends the standard Q-learning approach to incorporate a twostream framework of reward processing with biases biologically associated with several neurological and psychiatric conditions, including Parkinson’s and Alzheimer’s diseases, attentiondeficit/hyperactivity disorder (ADHD), addiction, and chronic pain. For AI community, the development of agents that react differently to different types of rewards can enable us to understand a wide spectrum of multi-agent interactions in complex real-world socioeconomic systems. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions and user preferences in longterm recommendation systems.

上一篇：Safe and Sample-Efficient Reinforcement Learning Algorithms for Factored Environments

下一篇：Teaching AI Agents Ethical Values Using Reinforcement Learning and Policy Orchestration (Extended Abstract) ?

用户评价

全部评价

还没有评论，说两句吧！

热门资源

Learning to Predi...

Much of model-based reinforcement learning invo...
Stratified Strate...

In this paper we introduce Stratified Strategy ...
The Variational S...

Unlike traditional images which do not offer in...
A Mathematical Mo...

Direct democracy, where each voter casts one vo...
Rating-Boosted La...

The performance of a recommendation system reli...

智能在线

400-630-6780
聆听.建议反馈

E-mail: support@tusaishared.com