Zhi Zhong

Do Foundational Audio Encoders Understand Music Structure?

Dec 19, 2025
Via arXiv

FoleyBench: A Benchmark For Video-to-Audio Models

Nov 17, 2025
Via arXiv

SoundReactor: Frame-level Online Video-to-Audio Generation

Oct 02, 2025
Via arXiv

TITAN-Guide: Taming Inference-Time AligNment for Guided Text-to-Video Diffusion Models

Aug 01, 2025
Via arXiv

SpecMaskFoley: Steering Pretrained Spectral Masked Generative Transformer Toward Synchronized Video-to-audio Synthesis via ControlNet

May 22, 2025
Via arXiv

Cross-Modal Learning for Music-to-Music-Video Description Generation

Mar 14, 2025
Via arXiv

Music Foundation Model as Generic Booster for Music Downstream Tasks

Nov 05, 2024
Via arXiv

OpenMU: Your Swiss Army Knife for Music Understanding

Oct 21, 2024
Via arXiv

VRVQ: Variable Bitrate Residual Vector Quantization for Audio Compression

Oct 12, 2024
Via arXiv

Variable Bitrate Residual Vector Quantization for Audio Coding

Oct 08, 2024
Via arXiv