Shinji Watanabe

Carnegie Mellon University

Learning Robust Spatial Representations from Binaural Audio through Feature Distillation

Aug 28, 2025

Geolocation-Aware Robust Spoken Language Identification

Aug 23, 2025

Music Arena: Live Evaluation for Text-to-Music

Jul 28, 2025

OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder

Jul 18, 2025

Improving Speech Enhancement with Multi-Metric Supervision from Learned Quality Assessment

Jun 13, 2025

Discrete Audio Tokens: More Than a Survey!

Jun 12, 2025

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs

Jun 12, 2025

Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training

Jun 08, 2025

Explainable Depression Detection using Masked Hard Instance Mining

May 30, 2025

Improving Multilingual Speech Models on ML-SUPERB 2.0: Fine-tuning with Data Augmentation and LID-Aware CTC

May 30, 2025