Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

May 10, 2021
Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva
