Shinji Watanabe

Carnegie Mellon University

Endpoint Anticipation for Low-Latency Spoken Dialogue

Jun 11, 2026

ANCHOR: Autoregressive Non-intrusive Chunk-Ordered Refinement for Joint Multi-Resolution Speech Quality Modeling

Jun 08, 2026

TRADE: Transducer-Augmented Decoder for Speech LLM

Jun 07, 2026

Benchmarking Speech-to-Speech Translation Models

Jun 02, 2026

Adapting Text LLMs to Speech via Multimodal Depth Up-Scaling

Apr 01, 2026

An Empirical Recipe for Universal Phone Recognition

Mar 30, 2026

Who Spoke What When? Evaluating Spoken Language Models for Conversational ASR with Semantic and Overlap-Aware Metrics

Mar 24, 2026

ADEPT: RL-Aligned Agentic Decoding of Emotion via Evidence Probing Tools: From Consensus Learning to Ambiguity-Driven Emotion Reasoning

Feb 13, 2026

Bagpiper: Solving Open-Ended Audio Tasks via Rich Captions

Feb 05, 2026

LALM-as-a-Judge: Benchmarking Large Audio-Language Models for Safety Evaluation in Multi-Turn Spoken Dialogues

Feb 04, 2026