Geng Hong

Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search

Mar 25, 2026
Via arXiv

MirrorGuard: Toward Secure Computer-Use Agents via Simulation-to-Real Reasoning Correction

Jan 19, 2026
Via arXiv

WebTrap Park: An Automated Platform for Systematic Security Evaluation of Web Agents

Jan 13, 2026
Via arXiv

When Bots Take the Bait: Exposing and Mitigating the Emerging Social Engineering Attack in Web Automation Agent

Jan 12, 2026
Via arXiv

SmartSight: Mitigating Hallucination in Video-LLMs Without Compromising Video Understanding via Temporal Attention Collapse

Dec 21, 2025
Via arXiv

ReasoningGuard: Safeguarding Large Reasoning Models with Inference-time Safety Aha Moments

Aug 06, 2025
Via arXiv

ReasoningShield: Content Safety Detection over Reasoning Traces of Large Reasoning Models

May 22, 2025
Via arXiv

OpenDeception: Benchmarking and Investigating AI Deceptive Behaviors via Open-ended Interaction Simulation

Apr 18, 2025
Via arXiv