Daxin Jiang

Unearthing Gems from Stones: Policy Optimization with Negative Sample Augmentation for LLM Reasoning

May 20, 2025

Beyond the First Error: Process Reward Models for Reflective Mathematical Reasoning

May 20, 2025

Step1X-3D: Towards High-Fidelity and Controllable Generation of Textured 3D Assets

May 12, 2025

DialogueReason: Rule-Based RL Sparks Dialogue Reasoning in LLMs

May 11, 2025

Step1X-Edit: A Practical Framework for General Image Editing

Apr 24, 2025

StreamRL: Scalable, Heterogeneous, and Elastic RL for LLMs with Disaggregated Stream Generation

Apr 22, 2025

PipeWeaver: Addressing Data Dynamicity in Large Multimodal Model Training with Dynamic Interleaved Pipeline

Apr 19, 2025

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025

Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Mar 31, 2025

M-DocSum: Do LVLMs Genuinely Comprehend Interleaved Image-Text in Document Summarization?

Mar 27, 2025