Baining Guo

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Apr 16, 2024
Via arXiv

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

Apr 05, 2024
Via arXiv

Simplified Diffusion Schrödinger Bridge

Mar 27, 2024
Via arXiv

VisualCritic: Making LMMs Perceive Visual Quality Like Humans

Mar 19, 2024
Via arXiv

RelationVLM: Making Large Vision-Language Models Understand Visual Relations

Mar 19, 2024
Via arXiv

CCA: Collaborative Competitive Agents for Image Editing

Jan 23, 2024
Via arXiv

VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder

Dec 18, 2023
Via arXiv

MicroCinema: A Divide-and-Conquer Approach for Text-to-Video Generation

Nov 30, 2023
Via arXiv

COLE: A Hierarchical Generation Framework for Graphic Design

Nov 28, 2023
Via arXiv

CCEdit: Creative and Controllable Video Editing via Diffusion Models

Sep 28, 2023
Via arXiv