Jiaheng Liu

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Jan 13, 2026

Thinking-Based Non-Thinking: Solving the Reward Hacking Problem in Training Hybrid Reasoning Models via Reinforcement Learning

Jan 08, 2026

Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements

Dec 31, 2025

T2AV-Compass: Towards Unified Evaluation for Text-to-Audio-Video Generation

Dec 24, 2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Dec 18, 2025

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Dec 14, 2025

AutoMV: An Automatic Multi-Agent System for Music Video Generation

Dec 13, 2025

MVU-Eval: Towards Multi-Video Understanding Evaluation for Multimodal LLMs

Nov 13, 2025

SWE-Compass: Towards Unified Evaluation of Agentic Coding Abilities for Large Language Models

Nov 07, 2025

Scaling Latent Reasoning via Looped Language Models

Oct 29, 2025