Lyumanshan Ye

AlphaEval: Evaluating Agents in Production

Apr 14, 2026
Via arXiv

Rubrics to Tokens: Bridging Response-level Rubrics and Token-level Rewards in Instruction Following Tasks

Apr 03, 2026
Via arXiv

ASI-Evolve: AI Accelerates AI

Mar 31, 2026
Via arXiv

Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model

Mar 23, 2026
Via arXiv

ProjDevBench: Benchmarking AI Coding Agents on End-to-End Project Development

Feb 02, 2026
Via arXiv

daVinci-Dev: Agent-native Mid-training for Software Engineering

Jan 27, 2026
Via arXiv

DeepPersona: A Generative Engine for Scaling Deep Synthetic Personas

Nov 11, 2025
Via arXiv

InnovatorBench: Evaluating Agents' Ability to Conduct Innovative LLM Research

Nov 03, 2025
Via arXiv

Interaction as Intelligence Part II: Asynchronous Human-Agent Rollout for Long-Horizon Task Training

Nov 03, 2025
Via arXiv

Context Engineering 2.0: The Context of Context Engineering

Oct 30, 2025
Via arXiv