Weizhu Chen

Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation

Jul 09, 2025

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Jun 10, 2025

R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution

May 20, 2025

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Apr 30, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Apr 29, 2025

Scaling Laws of Synthetic Data for Language Models

Mar 26, 2025

LLMs Can Generate a Better Answer by Aggregating Their Own Responses

Mar 06, 2025

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025

LongRoPE2: Near-Lossless LLM Context Window Scaling

Feb 27, 2025

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs

Feb 26, 2025