Xiaodan Liang

SpatialDreamer: Incentivizing Spatial Reasoning via Active Mental Imagery

Dec 08, 2025

Video Spatial Reasoning with Object-Centric 3D Rollout

Nov 17, 2025

GRPO-Guard: Mitigating Implicit Over-Optimization in Flow Matching via Regulated Clipping

Oct 25, 2025

Aligning Perception, Reasoning, Modeling and Interaction: A Survey on Physical AI

Oct 06, 2025

Embodied Arena: A Comprehensive, Unified, and Evolving Evaluation Platform for Embodied AI

Sep 18, 2025

LaVieID: Local Autoregressive Diffusion Transformers for Identity-Preserving Video Creation

Aug 11, 2025

X-SAM: From Segment Anything to Any Segmentation

Aug 06, 2025

C2-Evo: Co-Evolving Multimodal Data and Model for Self-Improving Reasoning

Jul 22, 2025

3D-MoRe: Unified Modal-Contextual Reasoning for Embodied Question Answering

Jul 16, 2025

PhyBlock: A Progressive Benchmark for Physical Understanding and Planning via 3D Block Assembly

Jun 10, 2025