Chenglin Wu

Co-Evolution of Policy and Internal Reward for Language Agents

Apr 03, 2026
Via arXiv

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026
Via arXiv

InfoPO: Information-Driven Policy Optimization for User-Centric Agents

Feb 28, 2026
Via arXiv

AutoWebWorld: Synthesizing Infinite Verifiable Web Environments via Finite State Machines

Feb 15, 2026
Via arXiv

AOrchestra: Automating Sub-Agent Creation for Agentic Orchestration

Feb 03, 2026
Via arXiv

Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks

Nov 19, 2025
Via arXiv

VisJudge-Bench: Aesthetics and Quality Assessment of Visualizations

Oct 25, 2025
Via arXiv

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Jul 08, 2025
Via arXiv

Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems

Mar 31, 2025
Via arXiv

MedAgentsBench: Benchmarking Thinking Models and Agent Frameworks for Complex Medical Reasoning

Mar 10, 2025
Via arXiv