David Harwath

FacEDiT: Unified Talking Face Editing and Generation via Facial Motion Infilling

Dec 16, 2025
Via arXiv

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing

Nov 15, 2025
Via arXiv

Unifying Model and Layer Fusion for Speech Foundation Models

Nov 11, 2025
Via arXiv

Probing the Robustness Properties of Neural Speech Codecs

May 30, 2025
Via arXiv

Rhapsody: A Dataset for Highlight Detection in Podcasts

May 26, 2025
Via arXiv

VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation

May 26, 2025
Via arXiv

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Apr 03, 2025
Via arXiv

Scaling Rich Style-Prompted Text-to-Speech Datasets

Mar 06, 2025
Via arXiv

How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

Nov 27, 2024
Via arXiv

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Nov 08, 2024
Via arXiv