Handing Wang

DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation

Mar 23, 2026

Multilingual Safety Alignment Via Sparse Weight Editing

Feb 26, 2026

SafeNeuron: Neuron-Level Safety Alignment for Large Language Models

Feb 12, 2026

Overlooked Safety Vulnerability in LLMs: Malicious Intelligent Optimization Algorithm Request and its Jailbreak

Jan 01, 2026

Solver-Independent Automated Problem Formulation via LLMs for High-Cost Simulation-Driven Design

Dec 21, 2025

From Parameter to Representation: A Closed-Form Approach for Controllable Model Merging

Nov 14, 2025

Implicit Jailbreak Attacks via Cross-Modal Information Concealment on Vision-Language Models

May 22, 2025

One Trigger Token Is Enough: A Defense Strategy for Balancing Safety and Usability in Large Language Models

May 12, 2025

ParetoHqD: Fast Offline Multiobjective Alignment of Large Language Models using Pareto High-quality Data

Apr 23, 2025

Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models

Apr 15, 2025