Eyal Lenga

One Step to the Side: Why Defenses Against Malicious Finetuning Fail Under Adaptive Adversaries

May 14, 2026
Via arXiv

GAVEL: Towards rule-based safety through activation monitoring

Jan 29, 2026
Via arXiv