Abstract
The sample inefficiency of standard deep reinforcement learning methods precludes their application
to many real-world problems. Methods that
leverage human demonstrations require fewer samples but have received less research attention. As demonstrated in the computer vision and natural language
processing communities, large-scale datasets have
the capacity to facilitate research by serving as an
experimental and benchmarking platform for new
methods. However, existing datasets compatible
with reinforcement learning simulators do not have
sufficient scale, structure, and quality to enable the
further development and evaluation of methods focused on using human examples. Therefore, we
introduce a comprehensive, large-scale, simulator-paired dataset of human demonstrations: MineRL.
The dataset consists of over 60 million automatically annotated state-action pairs across a variety
of related tasks in Minecraft, a dynamic, 3D, open-world environment. We present a novel data collection scheme which allows for the ongoing introduction of new tasks and the gathering of complete
state information suitable for a variety of methods. We demonstrate the hierarchality, diversity,
and scale of the MineRL dataset. Further, we show
the difficulty of the Minecraft domain along with
the potential of MineRL in developing techniques
to solve key research challenges within it.