Tingyu Weng

Aligned Better, Listen Better for Audio-Visual Large Language Models

Apr 02, 2025
Via arXiv

Wan: Open and Advanced Large-Scale Video Generative Models

Mar 26, 2025
Via arXiv

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Mar 04, 2025
Via arXiv