Yuki Mitsufuji

TITAN-Guide: Taming Inference-Time AligNment for Guided Text-to-Video Diffusion Models

Aug 01, 2025
Via arXiv

Music Arena: Live Evaluation for Text-to-Music

Jul 28, 2025
Via arXiv

Schrödinger Bridge Consistency Trajectory Models for Speech Enhancement

Jul 16, 2025
Via arXiv

Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution

Jul 09, 2025
Via arXiv

Denoising Multi-Beta VAE: Representation Learning for Disentanglement and Generation

Jul 09, 2025
Via arXiv

Fx-Encoder++: Extracting Instrument-Wise Audio Effects Representations from Mixtures

Jul 03, 2025
Via arXiv

Step-by-Step Video-to-Audio Synthesis via Negative Audio Guidance

Jun 26, 2025
Via arXiv

Vid-CamEdit: Video Camera Trajectory Editing with Generative Rendering from Estimated Geometry

Jun 16, 2025
Via arXiv

Can Large Language Models Predict Audio Effects Parameters from Natural Language?

May 27, 2025
Via arXiv

A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

May 26, 2025
Via arXiv