Shinji Watanabe

Carnegie Mellon University

Learning Robust Spatial Representations from Binaural Audio through Feature Distillation

Aug 28, 2025
Via arXiv

Geolocation-Aware Robust Spoken Language Identification

Aug 23, 2025
Via arXiv

Music Arena: Live Evaluation for Text-to-Music

Jul 28, 2025
Via arXiv

OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder

Jul 18, 2025
Via arXiv

Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment

Jun 13, 2025
Via arXiv

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs

Jun 12, 2025
Via arXiv

Discrete Audio Tokens: More Than a Survey!

Jun 12, 2025
Via arXiv

Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training

Jun 08, 2025
Via arXiv

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

May 30, 2025
Via arXiv

ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation

May 30, 2025
Via arXiv