
Weizhu Chen

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Jun 10, 2025

R&D-Agent: Automating Data-Driven AI Solution Building Through LLM-Powered Automated Research, Development, and Evolution

May 20, 2025

Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math

Apr 30, 2025

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Apr 29, 2025

Scaling Laws of Synthetic Data for Language Models

Mar 26, 2025

LLMs Can Generate a Better Answer by Aggregating Their Own Responses

Mar 06, 2025

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Mar 03, 2025

LongRoPE2: Near-Lossless LLM Context Window Scaling

Feb 27, 2025

COSMOS: A Hybrid Adaptive Optimizer for Memory-Efficient Training of LLMs

Feb 26, 2025

Segmenting Text and Learning Their Rewards for Improved RLHF in Language Model

Jan 06, 2025