Hang Xu

ECCV 2024 W-CODA: 1st Workshop on Multimodal Perception and Comprehension of Corner Cases in Autonomous Driving

Jul 02, 2025

SViP: Sequencing Bimanual Visuomotor Policies with Object-Centric Motion Primitives

Jun 23, 2025

Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution

Jun 15, 2025

Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection

Jun 12, 2025

Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs

Jun 06, 2025

SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning

May 25, 2025

CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback

Apr 28, 2025

PaMi-VDPO: Mitigating Video Hallucinations by Prompt-Aware Multi-Instance Video Preference Learning

Apr 08, 2025

ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement

Apr 03, 2025

From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D

Mar 29, 2025