Dezhang Kong

NeuRel-Attack: Neuron Relearning for Safety Disalignment in Large Language Models

Apr 29, 2025
Via arXiv