Learning Heuristic Selection with Dynamic Algorithm Configuration

Authors: David Speck, André Biedenkapp, Frank Hutter, Robert Mattmüller, Marius Lindauer
Abstract: A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.
Organisational unit(s): Fachgebiet Maschinelles Lernen
Institut für Informationsverarbeitung
External organisation(s): Albert-Ludwigs-Universität Freiburg
Bosch Center for Artificial Intelligence (BCAI)
Type: Paper in conference proceedings
Publication date: 2021
Publication status: Published electronically (E-pub)
Peer-reviewed: Yes
Electronic version(s): https://arxiv.org/abs/2006.08246 (Access: Open)
https://doi.org/10.1609/icaps.v31i1.16008 (Access: Open)

BibTeX

@inproceedings{6e0709733a724548be0b4ed0ea7d5aa2,
title = "Learning Heuristic Selection with Dynamic Algorithm Configuration",
abstract = "A key challenge in satisficing planning is to use multiple heuristics within one heuristic search. An aggregation of multiple heuristic estimates, for example by taking the maximum, has the disadvantage that bad estimates of a single heuristic can negatively affect the whole search. Since the performance of a heuristic varies from instance to instance, approaches such as algorithm selection can be successfully applied. In addition, alternating between multiple heuristics during the search makes it possible to use all heuristics equally and improve performance. However, all these approaches ignore the internal search dynamics of a planning system, which can help to select the most helpful heuristics for the current expansion step. We show that dynamic algorithm configuration can be used for dynamic heuristic selection which takes into account the internal search dynamics of a planning system. Furthermore, we prove that this approach generalizes over existing approaches and that it can exponentially improve the performance of the heuristic search. To learn dynamic heuristic selection, we propose an approach based on reinforcement learning and show empirically that domain-wise learned policies, which take the internal search dynamics of a planning system into account, can exceed existing approaches in terms of coverage.",
keywords = "cs.AI, cs.LG",
author = "David Speck and Andr{\'e} Biedenkapp and Frank Hutter and Robert Mattm{\"u}ller and Marius Lindauer",
note = "Funding Information: D. Speck was supported by the German Research Foundation (DFG) as part of the project EPSDAC (MA 7790/1-1). M. Lindauer acknowledges support by the DFG under LI 2801/4-1. A. Biedenkapp, M. Lindauer and F. Hutter acknowledge funding by the Robert Bosch GmbH.; 31st International Conference on Automated Planning and Scheduling, ICAPS 2021 ; Conference date: 02-08-2021 Through 13-08-2021",
year = "2021",
doi = "10.1609/icaps.v31i1.16008",
language = "English",
booktitle = "Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS)",
}
