William Chen

Towards Robust Speech Representation Learning for Thousands of Languages

Jul 02, 2024
Via arXiv

Nollywood: Let's Go to the Movies!

Jul 02, 2024
Via arXiv

On the Evaluation of Speech Foundation Models for Spoken Language Understanding

Jun 14, 2024
Via arXiv

On the Effects of Heterogeneous Data Sources on Speech-to-Text Foundation Models

Jun 13, 2024
Via arXiv

ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets

Jun 12, 2024
Via arXiv

YODAS: Youtube-Oriented Dataset for Audio and Speech

Jun 02, 2024
Via arXiv

Vision-Language Models Provide Promptable Representations for Reinforcement Learning

Feb 13, 2024
Via arXiv

OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer

Jan 30, 2024
Via arXiv

AugSumm: towards generalizable speech summarization using synthetic labels from large language model

Jan 10, 2024
Via arXiv

Indoor and Outdoor 3D Scene Graph Generation via Language-Enabled Spatial Ontologies

Dec 18, 2023
Via arXiv