Codex Large


SymptomWise: A Deterministic Reasoning Layer for Reliable and Efficient AI Systems

Apr 07, 2026
Via arXiv

Graph of Skills: Dependency-Aware Structural Retrieval for Massive Agent Skills

Apr 07, 2026
Via arXiv

Investigating Autonomous Agent Contributions in the Wild: Activity Patterns and Code Change over Time

Apr 01, 2026
Via arXiv

PRBench: End-to-end Paper Reproduction in Physics Research

Mar 29, 2026
Via arXiv

PostTrainBench: Can LLM Agents Automate LLM Post-Training?

Mar 10, 2026
Via arXiv

ESAA: Event Sourcing for Autonomous Agents in LLM-Based Software Engineering

Feb 26, 2026
Via arXiv

AIDev: Studying AI Coding Agents on GitHub

Feb 09, 2026
Via arXiv

SWE-AGI: Benchmarking Specification-Driven Software Construction with MoonBit in the Era of Autonomous Agents

Feb 11, 2026
Via arXiv

Automated QoR improvement in OpenROAD with coding agents

Jan 09, 2026
Via arXiv

130k Lines of Formal Topology in Two Weeks: Simple and Cheap Autoformalization for Everyone?

Jan 06, 2026
Via arXiv