The department warmly congratulates Simon Münker (Trier University) and Fabio Sartori (Karlsruhe Institute of Technology) on receiving the Best Paper Award in the “Information Technology, Social Justice, and Marginalized Contexts” Invited Track at the Hawaii International Conference on System Sciences.
Their award-winning paper, “Guardrail Vulnerabilities in Open-Source Language Models: Implications for Democratic Discourse and Marginalized Communities,” is available for reading and download here.
In their study, the authors examine whether safety mechanisms (“guardrails”) in leading open-source large language models effectively prevent the generation of harmful content. They evaluate seven widely used models, including Llama 3.1, Gemma 2, Mistral 7B, and Mixtral, applying advanced adversarial prompting techniques and systematically analyzing the outputs with established hate-speech detection tools.