Lingming Zhang

SEC-bench Pro: Can Language Models Solve Long-Horizon Software Security Tasks?

May 26, 2026

Code as Agent Harness

May 18, 2026

Beyond Isolation: A Unified Benchmark for General-Purpose Navigation

May 10, 2026

Agentic Vulnerability Reasoning on Windows COM Binaries

May 06, 2026

AnyPoC: Universal Proof-of-Concept Test Generation for Scalable LLM-Based Bug Detection

Apr 13, 2026

CAPTCHA Solving for Native GUI Agents: Automated Reasoning-Action Data Generation and Self-Corrective Training

Mar 23, 2026

SWE-Replay: Efficient Test-Time Scaling for Software Engineering Agents

Jan 29, 2026

InfiniteVGGT: Visual Geometry Grounded Transformer for Endless Streams

Jan 05, 2026

Toward Training Superintelligent Software Agents through Self-Play SWE-RL

Dec 21, 2025

Live-SWE-agent: Can Software Engineering Agents Self-Evolve on the Fly?

Nov 17, 2025