
Speedy Q-Learning

2020-01-08

Abstract

We introduce a new convergent variant of Q-learning, called speedy Q-learning (SQL), to address the problem of slow convergence in the standard form of the Q-learning algorithm. We prove a PAC bound on the performance of SQL, which shows that for an MDP with n state-action pairs and discount factor γ, only T = O(log(n)/(ε²(1−γ)⁴)) steps are required for the SQL algorithm to converge to an ε-optimal action-value function with high probability. This bound has a better dependency on 1/ε and 1/(1−γ), and thus is tighter than the best available result for Q-learning. Our bound is also superior to the existing results for both model-free and model-based instances of batch Q-value iteration that are considered to be more efficient than incremental methods like Q-learning.
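The abstract can be made concrete with a small sketch. Speedy Q-learning replaces the standard Q-learning update with one that uses the empirical Bellman operator applied to both the current and the previous iterate: Q_{k+1} = Q_k + α_k(T_k Q_{k−1} − Q_k) + (1 − α_k)(T_k Q_k − T_k Q_{k−1}), with step size α_k = 1/(k+1). Below is a minimal, illustrative Python sketch of the synchronous version on a tabular MDP; the helper names, the toy two-state MDP, and the value-iteration baseline are my own for demonstration, not from the paper.

```python
import numpy as np

def speedy_q_learning(P, R, gamma, num_iters, rng):
    """Synchronous speedy Q-learning sketch on a tabular MDP.

    P: (n_states, n_actions, n_states) transition probabilities
    R: (n_states, n_actions) rewards
    Each iteration samples one next state per (s, a) pair and applies
        Q_{k+1} = Q_k + a_k (T_k Q_{k-1} - Q_k)
                      + (1 - a_k)(T_k Q_k - T_k Q_{k-1}),  a_k = 1/(k+1),
    where T_k is the one-sample empirical Bellman optimality operator.
    """
    n_s, n_a, _ = P.shape
    Q_prev = np.zeros((n_s, n_a))
    Q = np.zeros((n_s, n_a))
    for k in range(num_iters):
        alpha = 1.0 / (k + 1)
        # One sampled next state for every (s, a) pair.
        next_states = np.array([
            [rng.choice(n_s, p=P[s, a]) for a in range(n_a)]
            for s in range(n_s)
        ])
        # Empirical Bellman operator applied to both iterates,
        # using the same samples for Q and Q_prev.
        TQ = R + gamma * Q.max(axis=1)[next_states]
        TQ_prev = R + gamma * Q_prev.max(axis=1)[next_states]
        Q_next = Q + alpha * (TQ_prev - Q) + (1 - alpha) * (TQ - TQ_prev)
        Q_prev, Q = Q, Q_next
    return Q

def value_iteration(P, R, gamma, num_iters=1000):
    """Exact value iteration, used here only as a reference for Q*."""
    n_s, n_a, _ = P.shape
    Q = np.zeros((n_s, n_a))
    for _ in range(num_iters):
        Q = R + gamma * P @ Q.max(axis=1)
    return Q

# Toy deterministic 2-state MDP: action a moves to state a;
# reward 1 for choosing action 1 (moving to / staying in state 1).
P = np.zeros((2, 2, 2))
P[0, 0, 0] = P[0, 1, 1] = 1.0
P[1, 0, 0] = P[1, 1, 1] = 1.0
R = np.array([[0.0, 1.0], [0.0, 1.0]])

rng = np.random.default_rng(0)
Q_sql = speedy_q_learning(P, R, gamma=0.9, num_iters=3000, rng=rng)
Q_star = value_iteration(P, R, gamma=0.9)
err = np.abs(Q_sql - Q_star).max()
```

In this deterministic toy MDP the empirical operator coincides with the exact one, so the sketch mainly illustrates the update's structure; the paper's PAC bound concerns the sampled, noisy setting.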

Previous: Query-Aware MCMC

Next: Fast and Accurate k-means For Large Datasets

