Masato Ishii

Mining Your Own Secrets: Diffusion Classifier Scores for Continual Personalization of Text-to-Image Diffusion Models

Oct 02, 2024
Via arXiv

A Simple but Strong Baseline for Sounding Video Generation: Effective Adaptation of Audio and Video Diffusion Models for Joint Generation

Sep 26, 2024
Via arXiv

Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

May 28, 2024
Via arXiv

Visual Echoes: A Simple Unified Transformer for Audio-Visual Generation

May 23, 2024
Via arXiv

Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion

Mar 28, 2023
Via arXiv

DetOFA: Efficient Training of Once-for-All Networks for Object Detection by Using Pre-trained Supernet and Path Filter

Mar 23, 2023
Via arXiv

Fine-grained Image Editing by Pixel-wise Guidance Using Diffusion Models

Dec 08, 2022
Via arXiv

Semi-supervised learning by selective training with pseudo labels via confidence estimation

Mar 15, 2021
Via arXiv

Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision

Mar 06, 2021
Via arXiv

Neural Network Libraries: A Deep Learning Framework Designed from Engineers' Perspectives

Feb 12, 2021
Via arXiv