Takashi Shibuya

Human-Feedback Efficient Reinforcement Learning for Online Diffusion Model Finetuning

Oct 07, 2024
Via arXiv

Embedded Topic Models Enhanced by Wikification

Oct 03, 2024
Via arXiv

A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation

Sep 26, 2024
Via arXiv

SpecMaskGIT: Masked Generative Modeling of Audio Spectrograms for Efficient Audio Synthesis and Beyond

Jun 26, 2024
Via arXiv

MoLA: Motion Generation and Editing with Latent Diffusion Enhanced by Adversarial Training

Jun 04, 2024
Via arXiv

Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

May 28, 2024
Via arXiv

SoundCTM: Uniting Score-based and Consistency Models for Text-to-Sound Generation

May 28, 2024
Via arXiv

GenWarp: Single Image to Novel Views with Semantic-Preserving Generative Warping

May 27, 2024
Via arXiv

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation

May 23, 2024
Via arXiv

HQ-VAE: Hierarchical Discrete Representation Learning with Variational Bayes

Dec 31, 2023
Via arXiv