Geng Hong

ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments

Aug 06, 2025
Via arXiv

ReasoningShield: Content Safety Detection over Reasoning Traces of Large Reasoning Models

May 22, 2025
Via arXiv

OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation

Apr 18, 2025
Via arXiv