Yan Lu

From Virtual Games to Real-World Play

Jun 23, 2025

StreamMel: Real-Time Zero-shot Text-to-Speech via Interleaved Continuous Autoregressive Modeling

Jun 14, 2025

Interaction, Process, Infrastructure: A Unified Architecture for Human-Agent Collaboration

Jun 13, 2025

Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning

Jun 12, 2025

Perfecting Depth: Uncertainty-Aware Enhancement of Metric Depth

Jun 05, 2025

LTM3D: Bridging Token Spaces for Conditional 3D Generation with Auto-Regressive Diffusion Framework

May 30, 2025

UI-Evol: Automatic Knowledge Evolving for Computer Use Agents

May 28, 2025

Text-Queried Audio Source Separation via Hierarchical Modeling

May 27, 2025

Zero-Shot Streaming Text to Speech Synthesis with Transducer and Auto-Regressive Modeling

May 26, 2025

Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding

May 23, 2025