David Harwath

Learning Audio-Visual Dereverberation

Jun 14, 2021
Changan Chen, Wei Sun, David Harwath, Kristen Grauman

Spoken Moments: Learning Joint Audio-Visual Representations from Video Descriptions

May 10, 2021
Mathew Monfort, SouYoung Jin, Alexander Liu, David Harwath, Rogerio Feris, James Glass, Aude Oliva

Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos

May 05, 2021
Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie Boggust, Rameswar Panda, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Michael Picheny, Shih-Fu Chang

Text-Free Image-to-Speech Synthesis Using Learned Segmental Units

Dec 31, 2020
Wei-Ning Hsu, David Harwath, Christopher Song, James Glass

AVLnet: Learning Audio-Visual Language Representations from Instructional Videos

Jun 16, 2020
Andrew Rouditchenko, Angie Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogerio Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James Glass

Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech

Nov 21, 2019
David Harwath, Wei-Ning Hsu, James Glass

Transfer Learning from Audio-Visual Grounding to Speech Recognition

Jul 09, 2019
Wei-Ning Hsu, David Harwath, James Glass

Towards Visually Grounded Sub-Word Speech Unit Discovery

Feb 21, 2019
David Harwath, James Glass

Vision as an Interlingua: Learning Multilingual Semantic Embeddings of Untranscribed Speech

Apr 09, 2018
David Harwath, Galen Chuang, James Glass
