James Glass

MIT Computer Science and Artificial Intelligence Laboratory, MA, USA

Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation

Jun 14, 2024

THREAD: Thinking Deeper with Recursive Spawning

May 27, 2024

Curiosity-driven Red-teaming for Large Language Models

Feb 29, 2024

Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective

Jan 16, 2024

R-Spin: Efficient Speaker and Noise-invariant Representation Learning with Acoustic Pieces

Nov 15, 2023

Audio-Visual Neural Syntax Acquisition

Oct 11, 2023

Joint Audio and Speech Understanding

Oct 02, 2023

Self-Specialization: Uncovering Latent Expertise within Large Language Models

Sep 29, 2023

Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning

Sep 19, 2023

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Sep 07, 2023