Ningyu Zhang

A Comprehensive Survey in LLM(-Agent) Full Stack Safety: Data, Training and Deployment

Apr 22, 2025

EasyEdit2: An Easy-to-use Steering Framework for Editing Large Language Models

Apr 21, 2025

SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Apr 04, 2025

Agentic Knowledgeable Self-awareness

Apr 04, 2025

ZJUKLAB at SemEval-2025 Task 4: Unlearning via Model Merging

Mar 27, 2025

ADS-Edit: A Multimodal Knowledge Editing Dataset for Autonomous Driving Systems

Mar 26, 2025

LookAhead Tuning: Safer Language Models via Partial Answer Previews

Mar 24, 2025

CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners

Mar 20, 2025

BiasEdit: Debiasing Stereotyped Language Models via Model Editing

Mar 11, 2025

LightThinker: Thinking Step-by-Step Compression

Feb 21, 2025