Picture for Shinji Watanabe

Shinji Watanabe

Carnegie Mellon University

Full-Duplex-Bench-v2: A Multi-Turn Evaluation Framework for Duplex Dialogue Systems with an Automated Examiner

Add code
Oct 09, 2025
Viaarxiv icon

Evaluating Self-Supervised Speech Models via Text-Based LLMS

Add code
Oct 06, 2025
Viaarxiv icon

Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems

Add code
Oct 02, 2025
Viaarxiv icon

Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage

Add code
Oct 02, 2025
Figure 1 for Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage
Figure 2 for Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage
Figure 3 for Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage
Figure 4 for Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage
Viaarxiv icon

CS-FLEURS: A Massively Multilingual and Code-Switched Speech Dataset

Add code
Sep 17, 2025
Viaarxiv icon

MAPSS: Manifold-based Assessment of Perceptual Source Separation

Add code
Sep 11, 2025
Viaarxiv icon

The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties

Add code
Sep 08, 2025
Figure 1 for The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties
Figure 2 for The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties
Figure 3 for The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties
Figure 4 for The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties
Viaarxiv icon

Learning Robust Spatial Representations from Binaural Audio through Feature Distillation

Add code
Aug 28, 2025
Viaarxiv icon

Geolocation-Aware Robust Spoken Language Identification

Add code
Aug 23, 2025
Viaarxiv icon

Music Arena: Live Evaluation for Text-to-Music

Add code
Jul 28, 2025
Viaarxiv icon