Qi Han

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering

Feb 12, 2026
Via arXiv

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026
Via arXiv

R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging

Feb 06, 2026
Via arXiv

Beyond Quantity: Trajectory Diversity Scaling for Code Agents

Feb 03, 2026
Via arXiv

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder

Jan 31, 2026
Via arXiv

STEP3-VL-10B Technical Report

Jan 15, 2026
Via arXiv

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jan 09, 2026
Via arXiv

MMFormalizer: Multimodal Autoformalization in the Wild

Jan 06, 2026
Via arXiv

Step-Audio 2 Technical Report

Jul 24, 2025
Via arXiv

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

May 21, 2025
Via arXiv