Steering of LLMs

Large language models (LLMs) can handle a wide range of tasks from a prompt alone, with little to no fine-tuning. Still, they tend to struggle with domain-specific expert skills, adjustments to a target audience, and similar requirements. The NLP Group conducts basic research on how to steer LLMs toward specific behaviors using reinforcement learning, specialized instruction fine-tuning, activation-based steering, learning to prompt, and more.

Featured Publications

  • Ziegenbein et al. (2026). Timon Ziegenbein, Maja Stahl, and Henning Wachsmuth. Teaching LLMs Human-Like Editing of Inappropriate Argumentation via Reinforcement Learning. ACL 2026, to appear.
  • Stahl et al. (2025). Maja Stahl, Timon Ziegenbein, Joonsuk Park, and Henning Wachsmuth. ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation. ACL Findings 2025.
  • Spliethöver et al. (2025). Maximilian Spliethöver, Tim Knebler, Fabian Fumagalli, Maximilian Muschalik, Barbara Hammer, Eyke Hüllermeier, and Henning Wachsmuth. Adaptive Prompting: Ad-hoc Prompt Composition for Social Bias Detection. NAACL 2025.
  • Ziegenbein et al. (2024). Timon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, and Henning Wachsmuth. LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback. ACL 2024.
  • Wachsmuth et al. (2024). Henning Wachsmuth, Gabriella Lapesa, Elena Cabrio, Anne Lauscher, Joonsuk Park, Eva Maria Vecchi, Serena Villata, and Timon Ziegenbein. Argument Quality Assessment in the Age of Instruction-Following Large Language Models. COLING 2024.

Projects