Xiangyu Yue

SpaceVista: All-Scale Visual Spatial Reasoning from mm to km

Oct 10, 2025
Via arXiv

Growing Visual Generative Capacity for Pre-Trained MLLMs

Oct 02, 2025
Via arXiv

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Sep 18, 2025
Via arXiv

Transition Models: Rethinking the Generative Learning Objective

Sep 04, 2025
Via arXiv

ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents

Jul 30, 2025
Via arXiv

MSR-Align: Policy-Grounded Multimodal Alignment for Safety-Aware Reasoning in Vision-Language Models

Jun 24, 2025
Via arXiv

Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations

Jun 23, 2025
Via arXiv

Pushing the Limits of Safety: A Technical Report on the ATLAS Challenge 2025

Jun 14, 2025
Via arXiv

ReSim: Reliable World Simulation for Autonomous Driving

Jun 11, 2025
Via arXiv

MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence

May 29, 2025
Via arXiv