Yunhao Fang

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Sep 06, 2024

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Aug 21, 2024

$VILA^2$: VILA Augmented VILA

Jul 24, 2024

PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View Instance Segmentation and Maximum Likelihood Estimation

Dec 05, 2023

Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving

Nov 01, 2023

Distilling Large Vision-Language Model with Out-of-Distribution Generalizability

Jul 19, 2023

Deductive Verification of Chain-of-Thought Reasoning

Jun 07, 2023