Qizhi Pei

Tracing the Roots: A Multi-Agent Framework for Uncovering Data Lineage in Post-Training LLMs

Apr 12, 2026
Via arXiv

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Mar 26, 2026
Via arXiv

ChartVerse: Scaling Chart Reasoning via Reliable Programmatic Synthesis from Scratch

Jan 20, 2026
Via arXiv

Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility

Jan 17, 2026
Via arXiv

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Dec 16, 2025
Via arXiv

Middo: Model-Informed Dynamic Data Optimization for Enhanced LLM Fine-Tuning via Closed-Loop Learning

Aug 29, 2025
Via arXiv

IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment

May 19, 2025
Via arXiv

CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges

Apr 27, 2025
Via arXiv

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

Mar 20, 2025
Via arXiv

MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer

Mar 19, 2025
Via arXiv