Yexin Liu

OmniGen2: Exploration to Advanced Multimodal Generation

Jun 23, 2025
Via arXiv

Enhancing Vector Quantization with Distributional Matching: A Theoretical and Empirical Study

Jun 18, 2025
Via arXiv

Leader360V: The Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment

Jun 17, 2025
Via arXiv

When Semantics Mislead Vision: Mitigating Large Multimodal Models Hallucinations in Scene Text Spotting and Understanding

Jun 05, 2025
Via arXiv

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Apr 04, 2025
Via arXiv

VideoGen-of-Thought: Step-by-step generating multi-shot video with minimal manual intervention

Mar 20, 2025
Via arXiv

ScalingNoise: Scaling Inference-Time Search for Generating Infinite Videos

Mar 20, 2025
Via arXiv

Temporal Regularization Makes Your Video Generator Stronger

Mar 19, 2025
Via arXiv

LightGen: Efficient Image Generation through Knowledge Distillation and Direct Preference Optimization

Mar 11, 2025
Via arXiv

MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation

Feb 17, 2025
Via arXiv