Karen Livescu

Shammie

DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding

Jun 13, 2024

On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

Jun 13, 2024

ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

Jun 12, 2024

Self-Supervised Speech Representations are More Phonetic than Semantic

Jun 12, 2024

SignMusketeers: An Efficient Multi-Stream Approach for Sign Language Translation at Scale

Jun 11, 2024

Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing

Feb 21, 2024

Generative Context-aware Fine-tuning of Self-supervised Speech Models

Dec 15, 2023

Toward Joint Language Modeling for Speech Units and Text

Oct 12, 2023

Audio-Visual Neural Syntax Acquisition

Oct 11, 2023

Few-Shot Spoken Language Understanding via Joint Speech-Text Models

Oct 09, 2023