Machine Learning for Online Algorithm Selection under Censored Feedback

authored by: Alexander Tornede, Viktor Bengs, Eyke Hüllermeier
Abstract: In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. As the latter is known to exhibit a heavy-tail distribution, an algorithm is normally stopped when exceeding a predefined upper time limit. As a consequence, machine learning methods used to optimize an algorithm selection strategy in a data-driven manner need to deal with right-censored samples, a problem that has received little attention in the literature so far. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. Moreover, we adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon. In an extensive experimental evaluation on an adapted version of the ASlib benchmark, we demonstrate that theoretically well-founded methods based on Thompson sampling perform specifically strong and improve in comparison to existing methods.
External Organisation(s): Ludwig-Maximilians-Universität München (LMU)
Paderborn University
Type: Conference contribution
Pages: 10370-10380
No. of pages: 11
Publication date: 30.06.2022
Publication status: Published
Peer reviewed: Yes
Electronic version(s): https://ojs.aaai.org/index.php/AAAI/article/view/21279 (Access: Open)

BibTeX

@inproceedings{765f5bdef8c6499990fbd2232a794f1d,
title = "Machine Learning for Online Algorithm Selection under Censored Feedback",
abstract = "In online algorithm selection (OAS), instances of an algorithmic problem class are presented to an agent one after another, and the agent has to quickly select a presumably best algorithm from a fixed set of candidate algorithms. For decision problems such as satisfiability (SAT), quality typically refers to the algorithm's runtime. As the latter is known to exhibit a heavy-tail distribution, an algorithm is normally stopped when exceeding a predefined upper time limit. As a consequence, machine learning methods used to optimize an algorithm selection strategy in a data-driven manner need to deal with right-censored samples, a problem that has received little attention in the literature so far. In this work, we revisit multi-armed bandit algorithms for OAS and discuss their capability of dealing with the problem. Moreover, we adapt them towards runtime-oriented losses, allowing for partially censored data while keeping a space- and time-complexity independent of the time horizon. In an extensive experimental evaluation on an adapted version of the ASlib benchmark, we demonstrate that theoretically well-founded methods based on Thompson sampling perform specifically strong and improve in comparison to existing methods.",
author = "Alexander Tornede and Viktor Bengs and Eyke H{\"u}llermeier",
note = "Publisher Copyright: Copyright {\textcopyright} 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.",
year = "2022",
month = jun,
day = "30",
language = "English",
pages = "10370--10380",
booktitle = "Proceedings of the 36th AAAI Conference on Artificial Intelligence",
}

Publication Details

Machine Learning for Online Algorithm Selection under Censored Feedback

Funded by