Abstract
By employing time-varying proximal functions, adaptive subgradient methods (ADAGRAD) improve the regret bound and have been widely used in online learning and optimization. However, ADAGRAD with full-matrix proximal functions (ADA-FULL) cannot handle large-scale problems, owing to its impractical time and space complexities, even though it performs better when gradients are correlated. In this paper, we propose ADA-FD, an efficient variant of ADA-FULL based on a deterministic matrix sketching technique called frequent directions. Following ADA-FULL, we incorporate ADA-FD into both the primal-dual subgradient method and the composite mirror descent method to develop two efficient algorithms. By maintaining and manipulating low-rank matrices, at each iteration the space complexity is reduced from O(d²) to O(τd) and the time complexity is reduced from O(d³) to O(τ²d), where d is the dimensionality of the data and τ ≪ d is the sketching size. Theoretical analysis reveals that the regret of our methods is close to that of ADA-FULL as long as the outer product matrix of the gradients is approximately low-rank. Experimental results show that ADA-FD is comparable to ADA-FULL and outperforms other state-of-the-art algorithms in online convex optimization as well as in training convolutional neural networks (CNNs).
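As a rough illustration of the sketching primitive the abstract refers to (not the paper's actual algorithm), frequent directions maintains a small ell × d sketch B whose Gram matrix BᵀB approximates the gradient outer-product matrix GᵀG; the function name, the halving variant, and the error bound 2·‖G‖_F²/ell used below are standard for frequent directions but are assumptions here, not taken from this abstract:

```python
import numpy as np

def frequent_directions(G, ell):
    """Deterministic sketch: return B (ell x d) with B.T @ B ~= G.T @ G.

    This is the classic halving variant: whenever the buffer fills,
    an SVD is taken and all squared singular values are shrunk by the
    (ell//2)-th one, zeroing out the bottom half of the rows.
    """
    n, d = G.shape
    B = np.zeros((ell, d))
    next_zero = 0  # index of the first all-zero row in B
    for g in G:
        B[next_zero] = g
        next_zero += 1
        if next_zero == ell:  # buffer full: shrink
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            delta = s[ell // 2] ** 2
            s = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
            B = s[:, None] * Vt  # rows ell//2 .. ell-1 are now zero
            next_zero = ell // 2
    return B
```

Only the ell × d sketch is ever stored, which is where the O(τd) space and O(τ²d) per-step cost quoted in the abstract come from; the sketch satisfies the deterministic guarantee ‖GᵀG − BᵀB‖₂ ≤ 2‖G‖_F²/ell.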