Xiaodan Liang

MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents

May 26, 2025

SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning

May 25, 2025

BridgeIV: Bridging Customized Image and Video Generation through Test-Time Autoregressive Identity Propagation

May 11, 2025

CombiBench: Benchmarking LLM Capability for Combinatorial Mathematics

May 06, 2025

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

May 03, 2025

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning

Apr 27, 2025

A0: An Affordance-Aware Hierarchical Model for General Robotic Manipulation

Apr 21, 2025

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

Mar 25, 2025

Video SimpleQA: Towards Factuality Evaluation in Large Video Language Models

Mar 24, 2025

WISA: World Simulator Assistant for Physics-Aware Text-to-Video Generation

Mar 11, 2025