Xiangzheng Zhang

SafeHarbor: Hierarchical Memory-Augmented Guardrail for LLM Agent Safety

May 07, 2026

TrajShield: Trajectory-Level Safety Mediation for Defending Text-to-Video Models Against Jailbreak Attacks

May 03, 2026

Thinking with Reasoning Skills: Fewer Tokens, More Accuracy

Apr 23, 2026

VEPO: Variable Entropy Policy Optimization for Low-Resource Language Foundation Models

Mar 19, 2026

Beyond Parameter Arithmetic: Sparse Complementary Fusion for Distribution-Aware Model Merging

Feb 12, 2026

Beyond Static Alignment: Hierarchical Policy Control for LLM Safety via Risk-Aware Chain-of-Thought

Feb 06, 2026

TriPlay-RL: Tri-Role Self-Play Reinforcement Learning for LLM Safety Alignment

Jan 26, 2026

FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning

Jan 26, 2026

Robust Privacy: Inference-Time Privacy through Certified Robustness

Jan 24, 2026

DIVER: Dynamic Iterative Visual Evidence Reasoning for Multimodal Fake News Detection

Jan 12, 2026