Xiangyang Xue

Fudan University

A Neural Representation Framework with LLM-Driven Spatial Reasoning for Open-Vocabulary 3D Visual Grounding

Jul 09, 2025
Via arXiv

Spatial-Temporal Aware Visuomotor Diffusion Policy Learning

Jul 09, 2025
Via arXiv

CrowdTrack: A Benchmark for Difficult Multiple Pedestrian Tracking in Real Scenarios

Jul 03, 2025
Via arXiv

TriVLA: A Triple-System-Based Unified Vision-Language-Action Model for General Robot Control

Jul 03, 2025
Via arXiv

TriVLA: A Unified Triple-System-Based Unified Vision-Language-Action Model for General Robot Control

Jul 02, 2025
Via arXiv

RAG-6DPose: Retrieval-Augmented 6D Pose Estimation via Leveraging CAD as Knowledge Base

Jun 23, 2025
Via arXiv

You Only Estimate Once: Unified, One-stage, Real-Time Category-level Articulated Object 6D Pose Estimation for Robotic Grasping

Jun 06, 2025
Via arXiv

CoLa: Chinese Character Decomposition with Compositional Latent Components

Jun 04, 2025
Via arXiv

ELA-ZSON: Efficient Layout-Aware Zero-Shot Object Navigation Agent with Hierarchical Planning

May 09, 2025
Via arXiv

Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

Apr 21, 2025
Via arXiv