Matteo Zavatteri

Refusal Before Decoding: Detecting and Exploiting Refusal Signals in Intermediate LLM Activations

May 27, 2026
Via arXiv