Cheng Qian

May

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Dec 21, 2025
Via arXiv

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Dec 18, 2025
Via arXiv

LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering

Nov 17, 2025
Via arXiv

Self-Improving LLM Agents at Test-Time

Oct 09, 2025
Via arXiv

Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

Oct 02, 2025
Via arXiv

LoCoBench: A Benchmark for Long-Context Large Language Models in Complex Software Engineering

Sep 11, 2025
Via arXiv

ISACL: Internal State Analyzer for Copyrighted Training Data Leakage

Aug 25, 2025
Via arXiv

UserBench: An Interactive Gym Environment for User-Centric Agents

Jul 29, 2025
Via arXiv

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Jul 28, 2025
Via arXiv

Atomic Reasoning for Scientific Table Claim Verification

Jun 08, 2025
Via arXiv