Theses & Projects

Student research projects and theses

We are continuously looking for students to work on student research projects and theses. The topics cover the entire spectrum of research areas in the projects our staff are currently working on. Your own topic ideas and concrete tasks are also welcome if they fit well with our research areas.

Because the supervisors' suggestions for topics and tasks change constantly, only a few selected topics are listed here.

Individual thesis topics

We are looking for students who would like to work on our future research projects. Depending on your skills and future plans, we will work together to find a topic in our research areas that best suits you. In your thesis, you will come into contact with emerging topics with a focus on AI. The following topics are examples of what such work might look like.

Suggested topics

  • Auto-PyTorch for X [BSc + MSc]

    Extend, apply and refine our AutoDL tool Auto-PyTorch for new areas such as outlier detection, maintenance prediction or time series prediction. We recommend a strong background in machine learning (especially Deep Learning) and in your chosen application area for this work. When applying, please indicate the direction you would like to work in and provide a rough plan of how you would apply Auto-PyTorch in your target area.

    Contact: Difan Deng

  • Implementation of a new DAC benchmark [BSc + MSc]

    Modelling, implementation and evaluation of dynamic algorithm configuration (DAC) for any target algorithm. To be successful in this topic, we recommend a strong background in RL, basic knowledge of DAC, and familiarity with the target domain of your choice. Possible target domains include machine learning or reinforcement learning, MIP or SAT solvers, and evolutionary algorithms.

    Contact: Theresa Eimer

  • Exploring Behavioral Similarity in Contextual Reinforcement Learning [BSc + MSc]

    Scaling up Reinforcement Learning to environments more complicated than games such as chess requires us to bias our learning methods so that they can cope with the issues of such settings, for example large state and action spaces and complicated dynamics. Behavioral similarity methods exploit conditions that lead to similar policies, i.e., policies that produce identical action distributions when given the same state. In this thesis, we want to study these methods in the contextual setting, wherein we will try to answer the following questions:

    1. Are similarity methods robust to small environmental changes, such as those that occur in contextual Reinforcement Learning?
    2. Can we improve these methods with this additional contextual information, for example through loss augmentations or other procedures?
    3. Can we meta-learn such metrics across environment distributions?

    Ideally, a bachelor thesis on this topic would entail ablating existing methods, while a master thesis would take it a step further and try to develop a new approach.
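
    As a rough illustration of what "identical action distributions for the same state" means, the following minimal sketch compares two policies state by state. The toy policies, the choice of total variation distance, and all function names are assumptions made for this example; they are not taken from any specific method studied in the thesis.

```python
import math

def softmax(logits):
    """Turn a list of action preferences into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def behavioral_distance(policy_a, policy_b, states):
    """Average distance between the action distributions of two policies."""
    distances = [total_variation(policy_a(s), policy_b(s)) for s in states]
    return sum(distances) / len(distances)

# Hypothetical toy policies: a state is a single number, there are two actions.
def policy_context_1(state):
    return softmax([state, -state])

def policy_context_2(state):
    # Stands in for a policy trained under a slightly different context.
    return softmax([1.1 * state, -1.1 * state])

states = [-1.0, -0.5, 0.0, 0.5, 1.0]
same = behavioral_distance(policy_context_1, policy_context_1, states)
shifted = behavioral_distance(policy_context_1, policy_context_2, states)
print(same)     # 0.0, identical policies behave identically
print(shifted)  # small but non-zero
```

    A distance of zero corresponds to behaviorally identical policies on the sampled states; the first research question above asks how such measures react when the context changes slightly.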

  • Meta-Policy Gradients in Contextual RL [MSc]

    Meta-Policy gradients (MPGs) aim at learning a set of hyperparameters in a single lifetime. The key idea is to interleave policy iteration with hyperparameter updates, thus learning an objective (Bellman or policy gradient objective) and then using it to learn a new parameter update. Recent work has shown that contextual information about the environment (such as information about goals) can enrich meta-gradients.

    Pitch: Study the impact of contextual information on vanilla and bootstrapped meta-gradients for generalization to similar environments. This work entails two questions:

    1. Do standard MPG techniques work well in the contextual setting?
    2. Does incorporating contextual information help with learning better hyperparameter schedules using MPGs in such settings?

    Depending on your interest, we can scope out how to concretely answer these questions. 

  • Multi-fidelity as a meta-learning problem [MSc]

    The primary goal of multi-fidelity optimisation is to save cost in the hyperparameter optimisation process by approximating the final performance, for example through shorter training or training with less data. Since these approximations typically improve gradually, it is advantageous to reduce the approximation error by selectively evaluating at multiple stages. The aim of this study is to transfer the knowledge gained at one stage to the next and to critically question which stage one should choose. The approach is to understand multi-fidelity as a continuous learning process across several similar tasks that need to be classified. We recommend a strong background in AutoML for this work.

    Contact: Tim Ruhkopf


  • Interpretable Hyperparameter Optimisation [BSc + MSc]

    Hyperparameter optimisation (HPO) methods can efficiently determine well-performing hyperparameter configurations. However, they often lack insight and transparency, as they do not provide the user with an explanation of the optimisation process or of the returned configuration. By extending HPO methods in various ways, we aim to approach a more interpretable HPO process. The goal of this work is to implement and evaluate new methods or extensions to existing methods for interpretable HPO.

    Contact: Sarah Segel

  • Hyperparameter Importance for AutoML [MSc]

    Adjusting hyperparameters in machine learning is essential to achieve high performance. However, some hyperparameters are more important than others. An AutoML process could therefore benefit from integrating hyperparameter importance methods into the optimisation process so that important hyperparameters are changed more frequently. We recommend prior knowledge of interpretable machine learning and AutoML for this work.

    Contact: Sarah Segel

  • Interactive AutoML [MSc]

    Most AutoML procedures currently allow little or no user interaction. As a result, users often only discover at the end of an AutoML run that the procedure delivers an unsatisfactory result, although this might have been foreseeable considerably earlier in the optimisation process. Many end users therefore have little confidence in AutoML procedures. The aim of this thesis is to increase the interaction possibilities of AutoML procedures in various ways.

    Contact: Alexander Tornede

  • AutoML x LLMs [BSc + MSc]

    Both Natural Language Processing (NLP) and Automated Machine Learning (AutoML) have achieved remarkable results over the past years. In NLP, Large Language Models (LLMs) in particular have recently seen a rapid series of breakthroughs. We envision that the two fields can radically push each other's boundaries through tight integration. The idea of this thesis is to investigate one or more ways in which LLMs can be integrated with AutoML tools, in particular with SMAC.

    Contact: Alexander Tornede

  • Augmenting algorithm components in RL through meta-learning [MSc]

    We can generate augmentation functions by meta-learning, for example for the policy objective in PPO. However, it remains an open question whether this holds for algorithm components in reinforcement learning in general, whether we could also learn augmentation ensembles, and how well these functions generalise. The goal of this work is to extend existing techniques to new algorithms and components.

    Contact: Theresa Eimer

  • All about Bayesian Optimisation - How to increase robustness and efficiency? [BSc + MSc]

    Bayesian optimisation (BO) comprises a class of model-based, efficient algorithms for black-box optimisation with very small budgets of function evaluations. The best settings for BO depend on the problem to be optimised, and they directly affect robustness and efficiency.
    The goal of this work is to increase the robustness and efficiency of BO by either tuning or meta-learning the trade-off between exploration and exploitation (dynamically), or by extending our tool SMAC3 with different state-of-the-art methods.
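
    To make the exploration-exploitation trade-off concrete, here is a minimal sketch using a lower confidence bound for a minimisation problem. The candidate values, the parameter name beta, and the helper functions are hypothetical and chosen only for illustration; this is not SMAC3's API, and a real BO loop would obtain the mean and standard deviation from a surrogate model.

```python
def lower_confidence_bound(mean, std, beta):
    """Score a candidate: predicted cost minus beta times the uncertainty."""
    return mean - beta * std

def pick_next(candidates, beta):
    """Select the candidate with the lowest confidence bound (minimisation)."""
    return min(
        candidates,
        key=lambda c: lower_confidence_bound(c["mean"], c["std"], beta),
    )

# Hypothetical surrogate predictions for two candidate configurations.
candidates = [
    {"name": "near_incumbent", "mean": 0.20, "std": 0.01},
    {"name": "unexplored", "mean": 0.35, "std": 0.20},
]

exploit_choice = pick_next(candidates, beta=0.1)["name"]
explore_choice = pick_next(candidates, beta=2.0)["name"]
print(exploit_choice)  # near_incumbent
print(explore_choice)  # unexplored
```

    A small beta favours configurations with a good predicted cost (exploitation), while a large beta favours uncertain regions (exploration). Tuning or meta-learning this trade-off dynamically would amount to changing a quantity like beta during the run.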

    Contact: Carolin Benjamins

  • Improve our HPO Tool SMAC [BSc]

    SMAC is a robust and flexible framework for Bayesian optimisation. It supports users in finding well-suited hyperparameter configurations for their algorithms, e.g. in machine learning. The core of SMAC consists of Bayesian optimisation combined with an "aggressive racing" mechanism that efficiently decides which configuration is more suitable.
    Of course, SMAC is not yet complete and we have some possible final thesis topics, e.g.:

    • Implementation and evaluation of state-of-the-art models that approximate the underlying function
    • Implementation and evaluation of input and output warping techniques
    • Implementation and evaluation of multi-objective quality indicators

    and many more!

    Contact: Carolin Benjamins

  • Implementation and Evaluation of Parallel HPO Approaches [BSc]

    Hyperparameter optimisation (HPO) is an important step in the development of machine learning models, but it can be computationally intensive and time-consuming. Parallel HPO methods, such as PASHA (Progressive Asynchronous Successive Halving Algorithm), attempt to address this problem by parallelising the search process. This project involves the implementation and evaluation of PASHA or similar parallel HPO approaches. The evaluation should consider factors such as computational efficiency, quality of the resulting models and scalability. The project could also compare the performance of the parallel HPO approach with traditional sequential HPO methods.
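
    For orientation, the following minimal sketch shows plain, sequential successive halving, the building block that PASHA extends. It is not PASHA itself: there is no asynchronous promotion and no progressive growth of the maximum budget. The toy loss function and all parameter names are assumptions made for this example.

```python
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configurations while multiplying the budget by eta."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Evaluate every surviving configuration at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]

def toy_loss(learning_rate, budget):
    """Hypothetical loss: best at learning_rate = 0.1, improving with more budget."""
    return (learning_rate - 0.1) ** 2 + 1.0 / budget

rng = random.Random(0)
configs = [rng.uniform(0.0, 1.0) for _ in range(8)]
best = successive_halving(configs, toy_loss)
print(best)
```

    In this sequential version every rung waits for all evaluations to finish. Parallel and asynchronous variants remove that synchronisation point, which is where the implementation and evaluation effort of this project would lie.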

    Contact: Helena Graf


  • Hyperparameter Importance Methods [MSc]

    Identifying the most important hyperparameters gives insights into the underlying model and indicates which hyperparameters to focus on when tuning. There are many hyperparameter importance methods available (Feature Importance, Ablation Analysis, Forward Selection, Local Parameter Importance, fANOVA), and a typical machine learning engineer may not know all of them. In this project, a detailed overview of all methods should be given. Furthermore, the application domains of specific methods and their advantages and disadvantages should be considered.

    Contact: Helena Graf


  • Investigate the Energy Efficiency of Search Space Candidates for Green AutoML [BSc + MSc]

    Search spaces are the backbone of HPO and AutoML tools, i.e. they provide the set of all possible candidates. Therefore, the best solution can only be as good as the search space allows it to be. So far, research has mainly focused on standard Deep Learning based on multiplication operations, which achieves great performance in terms of prediction quality and runtime. Using other kinds of neural networks could improve energy efficiency drastically and thereby reduce the environmental impact of HPO / AutoML.

    Contact: Tanja Tornede


  • Extension of HPO Benchmark to Cover Environmental Impact [BSc]

    Benchmarking is a common evaluation strategy for HPO methods. Benchmarks usually focus on performance and runtime and do not collect information about their environmental impact. The goal of this topic is to extend an existing benchmark to gain further insights into the resource and energy consumption of HPO.

    Contact: Tanja Tornede


  • Empirical Evaluation of Upper Bounds of Performance Metrics [BSc + MSc]

    The development of ML models is a tedious and time-consuming task. The performance of each model is unclear until it is actually evaluated. Therefore, a lot of resources might be wasted if the performance of the final model does not meet the previously defined minimal requirements, e.g. if an accuracy of 90% is required but the best model only achieves 70%. In such a case, either the model is not good enough and one has to invest more resources to develop a better one, or the quality of the data makes it impossible to achieve a better performance. To rule out the latter case, it would be helpful to have theoretical bounds for different performance metrics that could be checked in advance for a specific dataset. The goal of this thesis is to empirically evaluate existing proposals of such bounds and to determine whether they actually hold under different assumptions.

    Contact: Tanja Tornede



The exact procedure of a thesis, together with a rough idea of what we expect from theses, is described here.

It is important to us that the appropriate background knowledge is available so that a thesis has a chance of a positive conclusion. In order to be able to assess this accordingly, we would ask you to send us the following points:

  • Proposed topic or topic area(s)
  • What previous knowledge is available? What ML-related courses have been taken for this?
  • A self-assessment from -- to ++ on the following topics:
    • Coding in Python
    • Coding with PyTorch
    • Ability to implement a Deep Learning paper
    • Ability to implement a reinforcement learning paper
    • Ability to understand and execute a foreign codebase
If you are generally interested in writing a thesis with us but have not decided on any of the above topics, please email us with the above information.

If you are interested in a specific topic indicated above, please send an email directly to the contact person indicated in the topic. The email addresses can be found on the personal pages.