David Harwath

Probing the Robustness Properties of Neural Speech Codecs

May 30, 2025
Via arXiv

Rhapsody: A Dataset for Highlight Detection in Podcasts

May 26, 2025
Via arXiv

VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation

May 26, 2025
Via arXiv

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Apr 03, 2025
Via arXiv

Scaling Rich Style-Prompted Text-to-Speech Datasets

Mar 06, 2025
Via arXiv

How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

Nov 27, 2024
Via arXiv

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024
Via arXiv

SyllableLM: Learning Coarse Semantic Units for Speech Language Models

Oct 05, 2024
Via arXiv

Self-supervised Speech Models for Word-Level Stuttered Speech Detection

Sep 16, 2024
Via arXiv

Interface Design for Self-Supervised Speech Models

Jun 18, 2024
Via arXiv