Yuqing Yang

Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs?

Mar 25, 2026

SortedRL: Accelerating RL Training for LLMs through Online Length-Aware Scheduling

Mar 24, 2026

Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution

Mar 19, 2026

Understanding Reasoning in LLMs through Strategic Information Allocation under Uncertainty

Mar 16, 2026

Exploratory Memory-Augmented LLM Agent via Hybrid On- and Off-Policy Optimization

Feb 26, 2026

PosIR: Position-Aware Heterogeneous Information Retrieval Benchmark

Jan 13, 2026

MTraining: Distributed Dynamic Sparse Attention for Efficient Ultra-Long Context Training

Oct 21, 2025

$\Delta L$ Normalization: Rethink Loss Aggregation in RLVR

Sep 09, 2025

Sky Background Building of Multi-objective Fiber spectra Based on Mutual Information Network

Aug 27, 2025

SecurityLingua: Efficient Defense of LLM Jailbreak Attacks via Security-Aware Prompt Compression

Jun 15, 2025