Lifeng Shang

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Dec 27, 2025

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025

Rethinking Expert Trajectory Utilization in LLM Post-training

Dec 12, 2025

A1: Asynchronous Test-Time Scaling via Conformal Prediction

Sep 18, 2025

The Synergy Dilemma of Long-CoT SFT and RL: Investigating Post-Training Techniques for Reasoning VLMs

Jul 10, 2025

QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Jun 15, 2025

Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification

Jun 05, 2025

Pangu DeepDiver: Adaptive Search Intensity Scaling via Open-Web Reinforcement Learning

May 30, 2025

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning

May 28, 2025

Stepwise Reasoning Checkpoint Analysis: A Test Time Scaling Method to Enhance LLMs' Reasoning

May 23, 2025