Xiangyu Yue

A Progressive Training Strategy for Vision-Language Models to Counteract Spatio-Temporal Hallucinations in Embodied Reasoning

Apr 12, 2026
Via arXiv

The Latent Space: Foundation, Evolution, Mechanism, Ability, and Outlook

Apr 02, 2026
Via arXiv

Gen-Searcher: Reinforcing Agentic Search for Image Generation

Mar 30, 2026
Via arXiv

GIDE: Unlocking Diffusion LLMs for Precise Training-Free Image Editing

Mar 22, 2026
Via arXiv

FailureMem: A Failure-Aware Multimodal Framework for Autonomous Software Repair

Mar 18, 2026
Via arXiv

PreciseCache: Precise Feature Caching for Efficient and High-fidelity Video Generation

Mar 03, 2026
Via arXiv

BitDance: Scaling Autoregressive Generative Models with Binary Tokens

Feb 15, 2026
Via arXiv

Elastic Diffusion Transformer

Feb 15, 2026
Via arXiv

UniWeTok: An Unified Binary Tokenizer with Codebook Size $\mathit{2^{128}}$ for Unified Multimodal Large Language Model

Feb 15, 2026
Via arXiv

RISE: Self-Improving Robot Policy with Compositional World Model

Feb 11, 2026
Via arXiv