Qi Han

MathGen: Revealing the Illusion of Mathematical Competence through Text-to-Image Generation

Mar 31, 2026
Via arXiv

Failure Modes for Deep Learning-Based Online Mapping: How to Measure and Address Them

Mar 20, 2026
Via arXiv

MMSpec: Benchmarking Speculative Decoding for Vision-Language Models

Mar 16, 2026
Via arXiv

PRIME: A Process-Outcome Alignment Benchmark for Verifiable Reasoning in Mathematics and Engineering

Feb 12, 2026
Via arXiv

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Feb 11, 2026
Via arXiv

R-Align: Enhancing Generative Reward Models through Rationale-Centric Meta-Judging

Feb 06, 2026
Via arXiv

Beyond Quantity: Trajectory Diversity Scaling for Code Agents

Feb 03, 2026
Via arXiv

DockSmith: Scaling Reliable Coding Environments via an Agentic Docker Builder

Jan 31, 2026
Via arXiv

STEP3-VL-10B Technical Report

Jan 15, 2026
Via arXiv

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Jan 09, 2026
Via arXiv