Ming Yan

MUSEG: Reinforcing Video Temporal Understanding via Timestamp-Aware Multi-Segment Grounding

May 27, 2025
Via arXiv

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

May 23, 2025
Via arXiv

QwenLong-CPRS: Towards $\infty$-LLMs with Dynamic Context Optimization

May 23, 2025
Via arXiv

VLM-R$^3$: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought

May 22, 2025
Via arXiv

Mobile-Agent-V: A Video-Guided Approach for Effortless and Efficient Operational Knowledge Injection in Mobile Automation

May 21, 2025
Via arXiv

MAGI-1: Autoregressive Video Generation at Scale

May 19, 2025
Via arXiv

SoLoPO: Unlocking Long-Context Capabilities in LLMs via Short-to-Long Preference Optimization

May 16, 2025
Via arXiv

Towards Efficient Online Tuning of VLM Agents via Counterfactual Soft Reinforcement Learning

May 01, 2025
Via arXiv

DAE-KAN: A Kolmogorov-Arnold Network Model for High-Index Differential-Algebraic Equations

Apr 22, 2025
Via arXiv

AdaMMS: Model Merging for Heterogeneous Multimodal Large Language Models with Unsupervised Coefficient Optimization

Mar 31, 2025
Via arXiv