Shinji Watanabe

Carnegie Mellon University

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics

Mar 24, 2026

ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools -- From Consensus Learning to Ambiguity-Driven Emotion Reasoning

Feb 13, 2026

Bagpiper: Solving Open-Ended Audio Tasks via Rich Captions

Feb 05, 2026

LALM-as-a-Judge: Benchmarking Large Audio-Language Models for Safety Evaluation in Multi-Turn Spoken Dialogues

Feb 04, 2026

CALM: Joint Contextual Acoustic-Linguistic Modeling for Personalization of Multi-Speaker ASR

Jan 30, 2026

Optimizing Conversational Quality in Spoken Dialogue Systems with Reinforcement Learning from AI Feedback

Jan 27, 2026

The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge

Jan 22, 2026

ICASSP 2026 URGENT Speech Enhancement Challenge

Jan 20, 2026

PRiSM: Benchmarking Phone Realization in Speech Models

Jan 20, 2026

Do Neural Codecs Generalize? A Controlled Study Across Unseen Languages and Non-Speech Tasks

Jan 18, 2026