Emiru Tsunoo

DialogueSidon: Recovering Full-Duplex Dialogue Tracks from In-the-Wild Dialogue Audio

Apr 13, 2026 (via arXiv)

Optimizing Conversational Quality in Spoken Dialogue Systems with Reinforcement Learning from AI Feedback

Jan 27, 2026 (via arXiv)

Chain-of-Thought Reasoning in Streaming Full-Duplex End-to-End Spoken Dialogue Systems

Oct 02, 2025 (via arXiv)

LibriTTS-VI: A Public Corpus and Novel Methods for Efficient Voice Impression Control

Sep 19, 2025 (via arXiv)

Scheduled Interleaved Speech-Text Training for Speech-to-Speech Translation with LLMs

Jun 12, 2025 (via arXiv)

Differentiable K-means for Fully-optimized Discrete Token-based ASR

May 22, 2025 (via arXiv)

ESPnet-SDS: Unified Toolkit and Demo for Spoken Dialogue Systems

Mar 11, 2025 (via arXiv)

Causal Speech Enhancement with Predicting Semantics based on Quantized Self-supervised Learning Features

Dec 26, 2024 (via arXiv)

Task Arithmetic for Language Expansion in Speech Translation

Sep 17, 2024 (via arXiv)

Decoder-only Architecture for Streaming End-to-end Speech Recognition

Jun 23, 2024 (via arXiv)