Abstract
Recently, Evolution Strategies (ES) have been successfully applied to solve problems commonly addressed by reinforcement learning (RL). Because ES approaches
are simple, their runtime is often dominated by the RL task at hand (e.g., playing
a game). In this work, we introduce Progressive
Episode Lengths (PEL) as a new technique and incorporate it into ES. The main objective is to let the agent first play short, easy tasks with limited episode lengths and then use the gained knowledge to
solve longer, harder tasks with progressively
increasing lengths. This allows the agent to perform many
function evaluations and find a good solution for
short time horizons before adapting its strategy to
tackle longer time horizons. We evaluated PEL on
a subset of Atari games from OpenAI Gym, showing that it can substantially improve the optimization speed, stability, and final score of canonical
ES. Specifically, we show average improvements
of 80% after 2 hours and 32% after 10 hours compared
to canonical ES.
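
As a rough illustration of the idea summarized above, the following Python sketch combines a simplified (mu, lambda)-style ES with an episode-length cap that grows over the generations. It is a minimal sketch under stated assumptions, not the implementation evaluated in this work: a toy quadratic task stands in for an Atari game, and the doubling schedule, the function names (`episode_return`, `episode_length_schedule`, `es_with_pel`), and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np


def episode_return(theta, target, max_steps):
    """Toy stand-in for an RL rollout (assumption, not an Atari game).

    Step t yields reward -(theta[t] - target[t])**2, and the episode is
    cut off after at most max_steps steps.
    """
    steps = min(max_steps, len(theta))
    return -float(np.sum((theta[:steps] - target[:steps]) ** 2))


def episode_length_schedule(generation, start_len, max_len, grow_every):
    """Hypothetical schedule: double the length cap every grow_every generations."""
    return min(max_len, start_len * 2 ** (generation // grow_every))


def es_with_pel(target, generations=200, pop_size=20, n_parents=5,
                sigma=0.1, start_len=4, grow_every=25, seed=0):
    """Simplified (mu, lambda)-style ES with a progressively growing horizon."""
    rng = np.random.default_rng(seed)
    dim = len(target)
    theta = np.zeros(dim)

    # Log-rank recombination weights (an assumed, commonly used choice).
    weights = np.log(n_parents + 0.5) - np.log(np.arange(1, n_parents + 1))
    weights /= weights.sum()

    steps_used = 0  # total environment steps, a proxy for runtime
    for g in range(generations):
        horizon = episode_length_schedule(g, start_len, dim, grow_every)

        # Sample offspring and evaluate them on the truncated episode.
        noise = rng.standard_normal((pop_size, dim))
        returns = np.array([
            episode_return(theta + sigma * eps, target, horizon)
            for eps in noise
        ])
        steps_used += pop_size * horizon

        # Move the parent towards the weighted mean of the best offspring.
        best = np.argsort(-returns)[:n_parents]
        theta = theta + sigma * (weights @ noise[best])
    return theta, steps_used


if __name__ == "__main__":
    target = np.ones(32)
    theta, steps_used = es_with_pel(target)
    print("full-horizon return:", episode_return(theta, target, len(target)))
    print("environment steps used:", steps_used)
```

In this sketch, early generations are cheap because each evaluation only runs for a few steps, so the same step budget buys more function evaluations on the short horizon before the cap is raised. How the length cap should actually be grown in a real RL task is a design choice that this toy example does not settle.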