Hang Xu

JointDreamer: Ensuring Geometry Consistency and Text Congruence in Text-to-3D Generation via Joint Score Distillation

Jul 17, 2024

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Jul 11, 2024

HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-fine Pose-Reversible Guidance

Jul 09, 2024

Explicitly Guided Information Interaction Network for Cross-modal Point Cloud Completion

Jul 03, 2024

BiKC: Keypose-Conditioned Consistency Policy for Bimanual Robotic Manipulation

Jun 14, 2024

AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding

Jun 11, 2024

Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection

Jun 02, 2024

Correctable Landmark Discovery via Large Models for Vision-Language Navigation

May 29, 2024

LaneCorrect: Self-supervised Lane Detection

Apr 23, 2024

Minimizing Weighted Counterfactual Regret with Optimistic Online Mirror Descent

Apr 22, 2024