Justin Salamon

FLAM: Frame-Wise Language-Audio Modeling

May 08, 2025
Via arXiv

SILA: Signal-to-Language Augmentation for Enhanced Control in Text-to-Audio Generation

Dec 13, 2024
Via arXiv

Sketch2Sound: Controllable Audio Generation via Time-Varying Signals and Sonic Imitations

Dec 11, 2024
Via arXiv

Video-Guided Foley Sound Generation with Multimodal Controls

Nov 26, 2024
Via arXiv

Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning

Sep 17, 2024
Via arXiv

Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries

Aug 17, 2023
Via arXiv

Language-Guided Music Recommendation for Video via Prompt Analogies

Jun 15, 2023
Via arXiv

Efficient Spoken Language Recognition via Multilabel Classification

Jun 02, 2023
Via arXiv

Conditional Generation of Audio from Video via Foley Analogies

Apr 17, 2023
Via arXiv

Language-Guided Audio-Visual Source Separation via Trimodal Consistency

Mar 28, 2023
Via arXiv