James Glass

Simple and Effective Unsupervised Speech Synthesis

Apr 20, 2022
Alexander H. Liu, Cheng-I Jeff Lai, Wei-Ning Hsu, Michael Auli, Alexei Baevski, James Glass


CMKD: CNN/Transformer-Based Cross-Model Knowledge Distillation for Audio Classification

Mar 13, 2022
Yuan Gong, Sameer Khurana, Andrew Rouditchenko, James Glass


Controlling the Focus of Pretrained Language Generation Models

Mar 02, 2022
Jiabao Ji, Yoon Kim, James Glass, Tianxing He


Everything at Once -- Multi-modal Fusion Transformer for Video Retrieval

Dec 08, 2021
Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogerio Feris, David Harwath, James Glass, Hilde Kuehne


Routing with Self-Attention for Multimodal Capsule Networks

Dec 01, 2021
Kevin Duarte, Brian Chen, Nina Shvetsova, Andrew Rouditchenko, Samuel Thomas, Alexander Liu, David Harwath, James Glass, Hilde Kuehne, Mubarak Shah


Cascaded Multilingual Audio-Visual Learning from Videos

Nov 08, 2021
Andrew Rouditchenko, Angie Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogerio Feris, Brian Kingsbury, Michael Picheny, James Glass


SSAST: Self-Supervised Audio Spectrogram Transformer

Oct 19, 2021
Yuan Gong, Cheng-I Jeff Lai, Yu-An Chung, James Glass


Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Oct 14, 2021
Ian Palmer, Andrew Rouditchenko, Andrei Barbu, Boris Katz, James Glass


Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0

Oct 07, 2021
Sameer Khurana, Antoine Laurent, James Glass
