Publication Details

An evolution strategy with progressive episode lengths for playing games

authored by
Lior Fuks, Noor Awad, Frank Hutter, Marius Lindauer
Abstract

Recently, Evolution Strategies (ES) have been successfully applied to solve problems commonly addressed by reinforcement learning (RL). Due to the simplicity of ES approaches, their runtime is often dominated by the RL task at hand (e.g., playing a game). In this work, we introduce Progressive Episode Lengths (PEL) as a new technique and combine it with ES. The main objective is to let the agent first play short and easy tasks with limited episode lengths, and then use the gained knowledge to solve longer and harder tasks as the episode length progressively grows. This allows the agent to perform many function evaluations and find a good solution for short time horizons before adapting the strategy to tackle longer time horizons. We evaluated PEL on a subset of Atari games from OpenAI Gym, showing that it can substantially improve the optimization speed, stability, and final score of canonical ES. Specifically, we show average improvements of 80% (32%) after 2 hours (10 hours) compared to canonical ES.
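For readers unfamiliar with the idea, the following is a minimal, illustrative sketch of how a progressive episode-length schedule could be combined with a canonical ES loop. It assumes a classic OpenAI Gym environment and a toy linear policy; the linear schedule, hyperparameters, and helper names are assumptions made for illustration and are not taken from the paper.

    import numpy as np
    import gym  # assumes the classic Gym API (reset -> obs, step -> 4-tuple)

    def rollout(env, params, max_steps):
        """Evaluate a toy linear policy for at most max_steps steps."""
        obs, total_reward = env.reset(), 0.0
        for _ in range(max_steps):
            action = int(np.dot(params, obs) > 0)  # binary action from a linear score
            obs, reward, done, _ = env.step(action)
            total_reward += reward
            if done:
                break
        return total_reward

    def es_with_progressive_lengths(env_name="CartPole-v1", iterations=200,
                                    pop_size=20, sigma=0.1, lr=0.03,
                                    start_len=50, max_len=500):
        env = gym.make(env_name)
        theta = np.zeros(env.observation_space.shape[0])
        for t in range(iterations):
            # Progressive episode length: short rollouts early, full length later
            # (illustrative linear schedule, not the paper's schedule).
            ep_len = min(max_len,
                         start_len + int(t / iterations * (max_len - start_len)))
            noise = np.random.randn(pop_size, theta.size)
            returns = np.array([rollout(env, theta + sigma * eps, ep_len)
                                for eps in noise])
            # Canonical ES update from centered returns.
            advantages = (returns - returns.mean()) / (returns.std() + 1e-8)
            theta += lr / (pop_size * sigma) * noise.T @ advantages
        return theta

The point of the schedule is that early iterations spend little simulation time per rollout, so many candidate solutions can be evaluated before the episode length grows to its full value.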

External Organisation(s)
University of Freiburg
Type
Conference contribution
Pages
1234-1240
No. of pages
7
Publication date
2019
Publication status
Published
Peer reviewed
Yes
ASJC Scopus subject areas
Artificial Intelligence
Electronic version(s)
https://www.ijcai.org/Proceedings/2019/0172.pdf (Access: Unknown)
https://doi.org/10.24963/ijcai.2019/172 (Access: Closed)