
Yuchi Wang

Make Your Actor Talk: Generalizable and High-Fidelity Lip Sync with Motion and Appearance Disentanglement

Jun 12, 2024

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

May 24, 2024

LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?

Apr 16, 2024

UniEdit: A Unified Tuning-Free Framework for Video Motion and Appearance Editing

Feb 24, 2024

PCA-Bench: Evaluating Multimodal Large Language Models in Perception-Cognition-Action Chain

Feb 21, 2024

GAIA: Zero-shot Talking Avatar Generation

Nov 26, 2023

Towards End-to-End Embodied Decision Making via Multi-modal Large Language Model: Explorations with GPT4-Vision and Beyond

Oct 16, 2023