Yun Fu

Don't Judge by the Look: Towards Motion Coherent Video Representation

Mar 25, 2024

VaQuitA: Enhancing Alignment in LLM-Assisted Video Understanding

Dec 04, 2023

Exploring Question Decomposition for Zero-Shot VQA

Oct 25, 2023

Layout Sequence Prediction From Noisy Mobile Modality

Oct 09, 2023

Latent Graph Inference with Limited Supervision

Oct 06, 2023

Camouflaged Image Synthesis Is All You Need to Boost Camouflaged Detection

Aug 13, 2023

BEV-DG: Cross-Modal Learning under Bird's-Eye View for Domain Generalization of 3D Semantic Segmentation

Aug 12, 2023

Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!

Jun 06, 2023

SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds

Jun 03, 2023

UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild

May 25, 2023