Catherine Rasgaitis

Are Language Models Sensitive to Morally Irrelevant Distractors?

Feb 10, 2026
Via arXiv