Yuki Mitsufuji

A Comprehensive Real-World Assessment of Audio Watermarking Algorithms: Will They Survive Neural Codecs?

May 26, 2025
Via arXiv

SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet

May 22, 2025
Via arXiv

Improving Inference-Time Optimisation for Vocal Effects Style Transfer with a Gaussian Prior

May 16, 2025
Via arXiv

Dyadic Mamba: Long-term Dyadic Human Motion Synthesis

May 14, 2025
Via arXiv

Forging and Removing Latent-Noise Diffusion Watermarks Using a Single Image

Apr 27, 2025
Via arXiv

DiffVox: A Differentiable Model for Capturing and Analysing Professional Effects Distributions

Apr 20, 2025
Via arXiv

SteerMusic: Enhanced Musical Consistency for Zero-shot Text-Guided and Personalized Music Editing

Apr 15, 2025
Via arXiv

D^2USt3R: Enhancing 3D Reconstruction with 4D Pointmaps for Dynamic Scenes

Apr 08, 2025
Via arXiv

CARE: Aligning Language Models for Regional Cultural Awareness

Apr 07, 2025
Via arXiv

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Mar 26, 2025
Via arXiv