Gradient-based Hyperparameter Optimization through Reversible Learning


Abstract

Tuning hyperparameters of learning algorithms is hard because gradients are usually unavailable. We compute exact gradients of cross-validation performance with respect to all hyperparameters by chaining derivatives backwards through the entire training procedure. These gradients allow us to optimize thousands of hyperparameters, including step-size and momentum schedules, weight initialization distributions, richly parameterized regularization schemes, and neural network architectures. We compute hyperparameter gradients by exactly reversing the dynamics of stochastic gradient descent with momentum.
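
Reversing the training dynamics is the key mechanism: if momentum SGD can be run backwards exactly, the backward pass that chains derivatives through training never has to store the full weight trajectory. Below is a minimal sketch (not the authors' code) of such a forward/reverse pair, assuming the common update form v ← γv − (1−γ)∇L(w), w ← w + αv; the quadratic loss, variable names, and step counts are illustrative only.

```python
import numpy as np

def loss_grad(w):
    # Hypothetical training loss: a simple quadratic stands in for the real model.
    return 2.0 * w

def sgd_momentum_forward(w, v, alpha, gamma, steps):
    """Run momentum SGD forward, returning the final weights and velocity."""
    for _ in range(steps):
        v = gamma * v - (1.0 - gamma) * loss_grad(w)
        w = w + alpha * v
    return w, v

def sgd_momentum_reverse(w, v, alpha, gamma, steps):
    """Exactly invert the forward dynamics to recover the initial (w, v)."""
    for _ in range(steps):
        w = w - alpha * v                                 # undo the weight update
        v = (v + (1.0 - gamma) * loss_grad(w)) / gamma    # undo the velocity update
    return w, v

w0 = np.array([1.5, -0.3])
v0 = np.zeros_like(w0)
wT, vT = sgd_momentum_forward(w0, v0, alpha=0.1, gamma=0.9, steps=100)
w_rec, v_rec = sgd_momentum_reverse(wT, vT, alpha=0.1, gamma=0.9, steps=100)
print(np.allclose(w_rec, w0), np.allclose(v_rec, v0))    # expected: True True (small rounding error remains)
```

In floating-point arithmetic the division by γ does not exactly undo the multiplication in the forward step, so information is slowly lost; the paper addresses this with exact bookkeeping of the discarded bits, which this sketch omits.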
