Xiangyang Xue

Fudan University

CoLa: Chinese Character Decomposition with Compositional Latent Components

Jun 04, 2025

ELA-ZSON: Efficient Layout-Aware Zero-Shot Object Navigation Agent with Hierarchical Planning

May 09, 2025

Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation

Apr 21, 2025

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Apr 15, 2025

DecoFuse: Decomposing and Fusing the "What", "Where", and "How" for Brain-Inspired fMRI-to-Video Decoding

Apr 01, 2025

ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning

Mar 30, 2025

EmoHead: Emotional Talking Head via Manipulating Semantic Expression Parameters

Mar 25, 2025

ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models

Feb 27, 2025

Global Semantic-Guided Sub-image Feature Weight Allocation in High-Resolution Large Vision-Language Models

Jan 24, 2025

SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images

Dec 03, 2024