Yunxin Li

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

Nov 16, 2025
Via arXiv

Look Before Switch: Sensing-Assisted Handover in 5G NR V2I Networks

Nov 07, 2025
Via arXiv

A Question Answering Dataset for Temporal-Sensitive Retrieval-Augmented Generation

Aug 17, 2025
Via arXiv

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Jun 12, 2025
Via arXiv

VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

May 25, 2025
Via arXiv

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025
Via arXiv

VideoVista-CulturalLingo: 360$^\circ$ Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension

Apr 23, 2025
Via arXiv

Picking the Cream of the Crop: Visual-Centric Data Selection with Collaborative Agents

Feb 27, 2025
Via arXiv

UI-TARS: Pioneering Automated GUI Interaction with Native Agents

Jan 21, 2025
Via arXiv

Anim-Director: A Large Multimodal Model Powered Agent for Controllable Animation Video Generation

Aug 19, 2024
Via arXiv