Dong Yu

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression

Sep 19, 2025
Via arXiv

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Sep 18, 2025
Via arXiv

EconProver: Towards More Economical Test-Time Scaling for Automated Theorem Proving

Sep 16, 2025
Via arXiv

CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models

Sep 11, 2025
Via arXiv

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Sep 09, 2025
Via arXiv

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025
Via arXiv

Audio-Thinker: Guiding Audio Language Model When and How to Think via Reinforcement Learning

Aug 12, 2025
Via arXiv

Towards Hallucination-Free Music: A Reinforcement Learning Preference Optimization Framework for Reliable Song Generation

Aug 07, 2025
Via arXiv

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Aug 07, 2025
Via arXiv

Efficient Scaling for LLM-based ASR

Aug 06, 2025
Via arXiv