Chen Xiong

Steering Externalities: Benign Activation Steering Unintentionally Increases Jailbreak Risk for Large Language Models

Feb 03, 2026
Via arXiv

Hey, That's My Data! Label-Only Dataset Inference in Large Language Models

Jun 06, 2025
Via arXiv