Yuki Saito

CraBERT: Efficient Phoneme Encoder Pre-Training via Cascade Fusion of Subword Representations for Text-to-Speech

Jun 15, 2026
Via arXiv

Low-Latency Real-Time Audio Game Commentary System via LLM-Based Parallel Text Generation

Jun 11, 2026
Via arXiv

Probing Token Spaces under Generator Shift in AI-Generated Music Detection

Jun 07, 2026
Via arXiv

Do speech foundation models perceive speaker similarity as humans do?

Jun 04, 2026
Via arXiv

Kinetic-Optimal Scheduling with Moment Correction for Metric-Induced Discrete Flow Matching in Zero-Shot Text-to-Speech

May 10, 2026
Via arXiv

DialogueSidon: Recovering Full-Duplex Dialogue Tracks from In-the-Wild Dialogue Audio

Apr 13, 2026
Via arXiv

DecompGrind: A Decomposition Framework for Robotic Grinding via Cutting-Surface Planning and Contact-Force Adaptation

Mar 24, 2026
Via arXiv

Reference-Free Image Quality Assessment for Virtual Try-On via Human Feedback

Mar 13, 2026
Via arXiv

Real-Time Generation of Game Video Commentary with Multimodal LLMs: Pause-Aware Decoding Approaches

Mar 03, 2026
Via arXiv

Geneses: Unified Generative Speech Enhancement and Separation

Jan 26, 2026
Via arXiv