Shouling Ji

Attributing and Exploiting Safety Vectors through Global Optimization in Large Language Models

Jan 22, 2026

RiskAtlas: Exposing Domain-Specific Risks in LLMs through Knowledge-Graph-Guided Harmful Prompt Generation

Jan 08, 2026

Bridging the Copyright Gap: Do Large Vision-Language Models Recognize and Respect Copyrighted Content?

Dec 26, 2025

The Eminence in Shadow: Exploiting Feature Boundary Ambiguity for Robust Backdoor Attacks

Dec 17, 2025

DP-CSGP: Differentially Private Stochastic Gradient Push with Compressed Communication

Dec 15, 2025

Synthetic Voices, Real Threats: Evaluating Large Text-to-Speech Models in Generating Harmful Audio

Nov 14, 2025

NeuroBreak: Unveil Internal Jailbreak Mechanisms in Large Language Models

Sep 04, 2025

TrojanTO: Action-Level Backdoor Attacks against Trajectory Optimization Models

Jun 15, 2025

TooBadRL: Trigger Optimization to Boost Effectiveness of Backdoor Attacks on Deep Reinforcement Learning

Jun 12, 2025

VModA: An Effective Framework for Adaptive NSFW Image Moderation

May 29, 2025