James Glass

MIT Computer Science and Artificial Intelligence Laboratory, MA, USA

Overflow Prevention Enhances Long-Context Recurrent LLMs

May 12, 2025

PLAY2PROMPT: Zero-shot Tool Instruction Optimization for LLM Agents via Tool Play

Mar 18, 2025

Generate, Discriminate, Evolve: Enhancing Context Faithfulness via Fine-Grained Sentence-Level Self-Evolution

Mar 03, 2025

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Feb 13, 2025

mWhisper-Flamingo for Multilingual Audio-Visual Noise-Robust Speech Recognition

Feb 03, 2025

State-Space Large Audio Language Models

Nov 24, 2024

Teaching VLMs to Localize Specific Objects from In-context Examples

Nov 20, 2024

DC-Spin: A Speaker-invariant Speech Tokenizer for Spoken Language Models

Oct 31, 2024

A Closer Look at Neural Codec Resynthesis: Bridging the Gap between Codec and Waveform Generation

Oct 29, 2024

Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback

Oct 28, 2024