Yang Zhang

University of Science and Technology of China

Beyond Random Sampling: Efficient Language Model Pretraining via Curriculum Learning

Jun 12, 2025

PerfTracker: Online Performance Troubleshooting for Large-scale Model Training in Production

Jun 12, 2025

A Hierarchical Probabilistic Framework for Incremental Knowledge Tracing in Classroom Settings

Jun 11, 2025

The Emergence of Abstract Thought in Large Language Models Beyond Any Language

Jun 11, 2025

SoK: Data Reconstruction Attacks Against Machine Learning Models: Definition, Metrics, and Benchmark

Jun 09, 2025

Matryoshka Model Learning for Improved Elastic Student Models

May 29, 2025

R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning

May 27, 2025

Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective

May 27, 2025

PAMD: Plausibility-Aware Motion Diffusion Model for Long Dance Generation

May 26, 2025

Token-level Accept or Reject: A Micro Alignment Approach for Large Language Models

May 26, 2025