Publication Details

Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges

authored by
Bernd Bischl, Martin Binder, Michel Lang, Tobias Pielok, Jakob Richter, Stefan Coors, Janek Thomas, Theresa Ullmann, Marc Becker, Anne-Laure Boulesteix, Difan Deng, Marius Lindauer
Abstract

Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time-consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.

Organisation(s)
Institute of Artificial Intelligence
External Organisation(s)
Ludwig-Maximilians-Universität München (LMU)
TU Dortmund University
Type
Article
Journal
Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
Volume
13
No. of pages
70
ISSN
1942-4787
Publication date
10.03.2023
Publication status
Published
Peer reviewed
Yes
ASJC Scopus subject areas
Computer Science (all)
Electronic version(s)
https://doi.org/10.48550/arXiv.2107.05847 (Access: Open)
https://doi.org/10.1002/widm.1484 (Access: Open)