Xinyu Chen

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

Nov 16, 2025

FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction

Nov 07, 2025

Medical Referring Image Segmentation via Next-Token Mask Prediction

Nov 07, 2025

Self-Supervised Continuous Colormap Recovery from a 2D Scalar Field Visualization without a Legend

Jul 28, 2025

AnTKV: Anchor Token-Aware Sub-Bit Vector Quantization for KV Cache in Large Language Models

Jun 24, 2025

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Jun 12, 2025

SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning

Jun 05, 2025

VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Guided Iterative Policy Optimization

May 25, 2025

From Air to Wear: Personalized 3D Digital Fashion with AR/VR Immersive 3D Sketching

May 15, 2025

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025