David Harwath

Audio-Visual Neural Syntax Acquisition

Oct 11, 2023

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models

Sep 19, 2023

Style-transfer based Speech and Audio-visual Scene Understanding for Robot Action Sequence Acquisition from Videos

Jun 27, 2023

When to Use Efficient Self Attention? Profiling Text, Speech and Image Transformer Variants

Jun 14, 2023

Unit-based Speech-to-Speech Translation Without Parallel Data

May 24, 2023

Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages

May 21, 2023

Syllable Discovery and Cross-Lingual Generalization in a Visually Grounded, Self-Supervised Speech Model

May 19, 2023

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization

May 18, 2023

Continual Learning for On-Device Speech Recognition using Disentangled Conformers

Dec 13, 2022

Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models

Dec 03, 2022