
Hamed Hassani

Stateful Online Monitoring Catches Distributed Agent Attacks

May 29, 2026
Via arXiv

InfoSFT: Learn More and Forget Less with Information-Aware Token Weighting

May 14, 2026
Via arXiv

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

May 12, 2026
Via arXiv

Risk-Controlled Post-Processing of Decision Policies

May 07, 2026
Via arXiv

Detecting Safety Violations Across Many Agent Traces

Apr 13, 2026
Via arXiv

Contextual Safety Reasoning and Grounding for Open-World Robots

Feb 24, 2026
Via arXiv

Multi-Round Human-AI Collaboration with User-Specified Requirements

Feb 19, 2026
Via arXiv

When to Trust the Cheap Check: Weak and Strong Verification for Reasoning

Feb 19, 2026
Via arXiv

Robust Policy Optimization to Prevent Catastrophic Forgetting

Feb 09, 2026
Via arXiv

Towards Reducible Uncertainty Modeling for Reliable Large Language Model Agents

Feb 04, 2026
Via arXiv