Shinji Watanabe

Carnegie Mellon University

ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation

May 30, 2025

Interspeech 2025 URGENT Speech Enhancement Challenge

May 29, 2025

Uni-VERSA: Versatile Speech Assessment with a Unified Network

May 27, 2025

Context-Driven Dynamic Pruning for Large Speech Foundation Models

May 24, 2025

Differentiable K-means for Fully-optimized Discrete Token-based ASR

May 22, 2025

BLAB: Brutally Long Audio Bench

May 05, 2025

On The Landscape of Spoken Language Models: A Comprehensive Survey

Apr 11, 2025

Aligning Text-to-Music Evaluation with Human Preferences

Mar 20, 2025

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Mar 11, 2025

Talking Turns: Benchmarking Audio Foundation Models on Turn-Taking Dynamics

Mar 03, 2025