JELLY BEAN WORLD: A TESTBED FOR NEVER-ENDING LEARNING


2020-01-02

Abstract

Machine learning has shown growing success in recent years. However, current machine learning systems are highly specialized, trained for particular problems or domains, and typically on a single narrow dataset. Human learning, on the other hand, is highly general and adaptable. Never-ending learning is a machine learning paradigm that aims to bridge this gap, with the goal of encouraging researchers to design machine learning systems that can learn to perform a wider variety of inter-related tasks in more complex environments. To date, there is no environment or testbed to facilitate the development and evaluation of never-ending learning systems. To this end, we propose the Jelly Bean World testbed. The Jelly Bean World allows experimentation over two-dimensional grid worlds which are filled with items and in which agents can navigate. This testbed provides environments that are sufficiently complex and where more generally intelligent algorithms ought to perform better than current state-of-the-art reinforcement learning approaches. It does so by producing non-stationary environments and facilitating experimentation with multi-task, multi-agent, multi-modal, and curriculum learning settings. We hope that the Jelly Bean World will prompt new interest in the development of never-ending learning, and more broadly general intelligence.

