Sri Durga Sai Sowmya Kadali

Jailbreaking Leaves a Trace: Understanding and Detecting Jailbreak Attacks from Internal Representations of Large Language Models

Feb 12, 2026
Via arXiv