Picture for Mingxiao Li

Mingxiao Li

RLVMR: Reinforcement Learning with Verifiable Meta-Reasoning Rewards for Robust Long-Horizon Agents

Add code
Jul 30, 2025
Viaarxiv icon

Step-Audio 2 Technical Report

Add code
Jul 24, 2025
Viaarxiv icon

Consistent Story Generation with Asymmetry Zigzag Sampling

Add code
Jun 12, 2025
Viaarxiv icon

Mitigating Negative Interference in Multilingual Sequential Knowledge Editing through Null-Space Constraints

Add code
Jun 12, 2025
Viaarxiv icon

VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization

Add code
May 19, 2025
Viaarxiv icon

Towards More Accurate Personalized Image Generation: Addressing Overfitting and Evaluation Bias

Add code
Mar 09, 2025
Viaarxiv icon

On a Connection Between Imitation Learning and RLHF

Add code
Mar 07, 2025
Viaarxiv icon

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Add code
Feb 18, 2025
Viaarxiv icon

From Visuals to Vocabulary: Establishing Equivalence Between Image and Text Token Through Autoregressive Pre-training in MLLMs

Add code
Feb 13, 2025
Viaarxiv icon

SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters

Add code
Feb 04, 2025
Figure 1 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 2 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 3 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Figure 4 for SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Viaarxiv icon