Jiaheng Liu

SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models

Nov 07, 2025
Via arXiv

Scaling Latent Reasoning via Looped Language Models

Oct 29, 2025
Via arXiv

Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?

Sep 04, 2025
Via arXiv

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Aug 11, 2025
Via arXiv

IFEvalCode: Controlled Code Generation

Jul 30, 2025
Via arXiv

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Jul 08, 2025
Via arXiv

A Survey on Latent Reasoning

Jul 08, 2025
Via arXiv

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization

Jul 08, 2025
Via arXiv

Scaling Test-time Compute for LLM Agents

Jun 15, 2025
Via arXiv

TaskCraft: Automated Generation of Agentic Tasks

Jun 11, 2025
Via arXiv