Yuan Gong

Joint Audio and Speech Understanding

Oct 02, 2023
Via arXiv

Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning

Sep 19, 2023
Via arXiv

ToonTalker: Cross-Domain Face Reenactment

Aug 24, 2023
Via arXiv

Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation

Jul 13, 2023
Via arXiv

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Jul 06, 2023
Via arXiv

TaleCrafter: Interactive Story Visualization with Multiple Characters

May 30, 2023
Via arXiv

SAIL: Search-Augmented Instruction Learning

May 24, 2023
Via arXiv

Listen, Think, and Understand

May 18, 2023
Via arXiv

3D GAN Inversion with Facial Symmetry Prior

Nov 30, 2022
Via arXiv

MAP: Modality-Agnostic Uncertainty-Aware Vision-Language Pre-training Model

Oct 11, 2022
Via arXiv