Hang Xu

KFFocus: Highlighting Keyframes for Enhanced Video Understanding

Aug 12, 2025

Direct Dual-Energy CT Material Decomposition using Model-based Denoising Diffusion Model

Jul 24, 2025

C2-Evo: Co-Evolving Multimodal Data and Model for Self-Improving Reasoning

Jul 22, 2025

ECCV 2024 W-CODA: 1st Workshop on Multimodal Perception and Comprehension of Corner Cases in Autonomous Driving

Jul 02, 2025

SViP: Sequencing Bimanual Visuomotor Policies with Object-Centric Motion Primitives

Jun 23, 2025

Adaptive Dropout: Unleashing Dropout across Layers for Generalizable Image Super-Resolution

Jun 15, 2025

Semantic-decoupled Spatial Partition Guided Point-supervised Oriented Object Detection

Jun 12, 2025

Does Your 3D Encoder Really Work? When Pretrain-SFT from 2D VLMs Meets 3D VLMs

Jun 06, 2025

SeePhys: Does Seeing Help Thinking? -- Benchmarking Vision-Based Physics Reasoning

May 25, 2025

CoherenDream: Boosting Holistic Text Coherence in 3D Generation via Multimodal Large Language Models Feedback

Apr 28, 2025