Xiaogeng Liu

DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agents

May 06, 2026, via arXiv

Mind Your HEARTBEAT! Claw Background Execution Inherently Enables Silent Memory Pollution

Mar 25, 2026, via arXiv

ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention

Mar 23, 2026, via arXiv

ReasoningBomb: A Stealthy Denial-of-Service Attack by Inducing Pathologically Long Reasoning in Large Reasoning Models

Jan 29, 2026, via arXiv

MetaAgent: Automatically Constructing Multi-Agent Systems Based on Finite State Machines

Jul 30, 2025, via arXiv

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Jun 13, 2025, via arXiv

OET: Optimization-based Prompt Injection Evaluation Toolkit

May 01, 2025, via arXiv

Doxing via the Lens: Revealing Privacy Leakage in Image Geolocation for Agentic Multi-Modal Large Reasoning Model

Apr 29, 2025, via arXiv

AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection

Feb 18, 2025, via arXiv

InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models

Oct 30, 2024, via arXiv