Takuya Yoshioka

Target conversation extraction: Source separation using turn-taking dynamics

Jul 15, 2024

Look Once to Hear: Target Speech Hearing with Noisy Examples

May 10, 2024

Anatomy of Industrial Scale Multilingual ASR

Apr 16, 2024

Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables

Nov 01, 2023

Profile-Error-Tolerant Target-Speaker Voice Activity Detection

Sep 21, 2023

t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

Sep 15, 2023

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023

SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

Aug 14, 2023

Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

May 30, 2023

i-Code Studio: A Configurable and Composable Framework for Integrative AI

May 23, 2023