Shinji Watanabe

Carnegie Mellon University

Comparative Reasoning: Making an Audio Language Model Better at Comparing Emotions

Jun 23, 2026

Evaluating Large Language Models Abilities for Addressee, Turn-change, and Next Speaker Prediction in Meetings

Jun 16, 2026

Grounding Spoken LLMs in Multi-Speaker Audio via Diarization Conditioning

Jun 16, 2026

Endpoint Anticipation for Low-Latency Spoken Dialogue

Jun 11, 2026

ANCHOR: Autoregressive Non-intrusive Chunk-Ordered Refinement for Joint Multi-Resolution Speech Quality Modeling

Jun 08, 2026

TRADE: Transducer-Augmented Decoder for Speech LLM

Jun 07, 2026

Benchmarking Speech-to-Speech Translation Models

Jun 02, 2026

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

Apr 01, 2026

An Empirical Recipe for Universal Phone Recognition

Mar 30, 2026

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics

Mar 24, 2026