Kazuhiro Saito

Language-Guided Contrastive Audio-Visual Masked Autoencoder with Automatically Generated Audio-Visual-Text Triplets from Videos

Jul 16, 2025
Via arXiv