Xinyu Chen

FashionStylist: An Expert Knowledge-enhanced Multimodal Dataset for Fashion Understanding

Apr 13, 2026

A Layer-wise Analysis of Supervised Fine-Tuning

Apr 12, 2026

INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Apr 08, 2026

MSVBench: Towards Human-Level Evaluation of Multi-Shot Video Generation

Feb 27, 2026

Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data

Nov 16, 2025

Medical Referring Image Segmentation via Next-Token Mask Prediction

Nov 07, 2025

FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction

Nov 07, 2025

Self-Supervised Continuous Colormap Recovery from a 2D Scalar Field Visualization without a Legend

Jul 28, 2025

AnTKV: Anchor Token-Aware Sub-Bit Vector Quantization for KV Cache in Large Language Models

Jun 24, 2025

AniMaker: Automated Multi-Agent Animated Storytelling with MCTS-Driven Clip Generation

Jun 12, 2025