Picture for Xia Hu

Xia Hu

A Trajectory-Based Safety Audit of Clawdbot (OpenClaw)

Add code
Feb 16, 2026
Viaarxiv icon

DeepSight: An All-in-One LM Safety Toolkit

Add code
Feb 12, 2026
Viaarxiv icon

Decoupled Reasoning with Implicit Fact Tokens (DRIFT): A Dual-Model Framework for Efficient Long-Context Inference

Add code
Feb 10, 2026
Viaarxiv icon

RAPO: Risk-Aware Preference Optimization for Generalizable Safe Reasoning

Add code
Feb 04, 2026
Viaarxiv icon

LPS-Bench: Benchmarking Safety Awareness of Computer-Use Agents in Long-Horizon Planning under Benign and Adversarial Scenarios

Add code
Feb 03, 2026
Viaarxiv icon

Interpreting Emergent Extreme Events in Multi-Agent Systems

Add code
Jan 28, 2026
Viaarxiv icon

AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security

Add code
Jan 26, 2026
Viaarxiv icon

The Why Behind the Action: Unveiling Internal Drivers via Agentic Attribution

Add code
Jan 21, 2026
Viaarxiv icon

OpenRT: An Open-Source Red Teaming Framework for Multimodal LLMs

Add code
Jan 04, 2026
Viaarxiv icon

BLASST: Dynamic BLocked Attention Sparsity via Softmax Thresholding

Add code
Dec 12, 2025
Viaarxiv icon