Sophie Lei

Layer-Isolated Evaluation: Gating the Deterministic Scaffold of a Production LLM Agent with a No-LLM, Regression-Locked Test Harness

Jun 10, 2026
Via arXiv

Catching One in Five: LLM-as-Judge Blind Spots in Production Multi-Turn Transaction Agents

Jun 09, 2026
Via arXiv