Yixia Li

SPPO: Sequence-Level PPO for Long-Horizon Reasoning Tasks

Apr 10, 2026

Towards Fair and Comprehensive Evaluation of Routers in Collaborative LLM Systems

Feb 12, 2026

Anchored Policy Optimization: Mitigating Exploration Collapse Via Support-Constrained Rectification

Feb 05, 2026

Rethinking the Role of Entropy in Optimizing Tool-Use Behaviors for Large Language Model Agents

Feb 02, 2026

From Abstract to Contextual: What LLMs Still Cannot Do in Mathematics

Jan 30, 2026

No More Stale Feedback: Co-Evolving Critics for Open-World Agent Learning

Jan 11, 2026

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Dec 21, 2025

VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models

Aug 13, 2025

ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs

Apr 17, 2025

LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy

Feb 17, 2025