Koichi Saito

Schrodinger Audio-Visual Editor: Object-Level Audiovisual Removal

Dec 14, 2025

FoleyBench: A Benchmark For Video-to-Audio Models

Nov 17, 2025

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Oct 08, 2025

SoundReactor: Frame-level Online Video-to-Audio Generation

Oct 02, 2025

Music Arena: Live Evaluation for Text-to-Music

Jul 28, 2025

Schrödinger Bridge Consistency Trajectory Models for Speech Enhancement

Jul 16, 2025

Dyadic Mamba: Long-term Dyadic Human Motion Synthesis

May 14, 2025

Aligning Text-to-Music Evaluation with Human Preferences

Mar 20, 2025

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music

Dec 23, 2024

DisMix: Disentangling Mixtures of Musical Instruments for Source-level Pitch and Timbre Manipulation

Aug 20, 2024