Picture for David Harwath

David Harwath

FacEDiT: Unified Talking Face Editing and Generation via Facial Motion Infilling

Add code
Dec 16, 2025
Viaarxiv icon

VoiceCraft-X: Unifying Multilingual, Voice-Cloning Speech Synthesis and Speech Editing

Add code
Nov 15, 2025
Viaarxiv icon

Unifying Model and Layer Fusion for Speech Foundation Models

Add code
Nov 11, 2025
Viaarxiv icon

Probing the Robustness Properties of Neural Speech Codecs

Add code
May 30, 2025
Viaarxiv icon

VoiceStar: Robust Zero-Shot Autoregressive TTS with Duration Control and Extrapolation

Add code
May 26, 2025
Viaarxiv icon

Rhapsody: A Dataset for Highlight Detection in Podcasts

Add code
May 26, 2025
Viaarxiv icon

VoiceCraft-Dub: Automated Video Dubbing with Neural Codec Language Models

Add code
Apr 03, 2025
Viaarxiv icon

Scaling Rich Style-Prompted Text-to-Speech Datasets

Add code
Mar 06, 2025
Viaarxiv icon

How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

Add code
Nov 27, 2024
Figure 1 for How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Figure 2 for How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Figure 3 for How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Figure 4 for How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario
Viaarxiv icon

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Add code
Nov 08, 2024
Figure 1 for Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Figure 2 for Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Figure 3 for Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Figure 4 for Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Viaarxiv icon