Leon Eshuijs

But what is your honest answer? Aiding LLM-judges with honest alternatives using steering vectors

May 23, 2025
Via arXiv

Short-circuiting Shortcuts: Mechanistic Investigation of Shortcuts in Text Classification

May 09, 2025
Via arXiv

Balancing the Scales: Reinforcement Learning for Fair Classification

Jul 15, 2024
Via arXiv