Bendong Jiang

LITMUS: Benchmarking Behavioral Jailbreaks of LLM Agents in Real OS Environments

May 11, 2026
Via arXiv