Yongyuan Liang

Anticipatory Planning for Multimodal AI Agents

Mar 17, 2026

Learning Situated Awareness in the Real World

Feb 18, 2026

Failure-Aware RL: Reliable Offline-to-Online Reinforcement Learning with Self-Recovery for Real-World Manipulation

Jan 12, 2026

MomaGraph: State-Aware Unified Scene Graphs with Vision-Language Model for Embodied Task Planning

Dec 18, 2025

Lemon: A Unified and Scalable 3D Multimodal Model for Universal Spatial Understanding

Dec 14, 2025

WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation

Nov 14, 2025

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Jun 11, 2025

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Jun 05, 2025

Magma: A Foundation Model for Multimodal AI Agents

Feb 18, 2025

TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies

Dec 13, 2024