Picture for Takanori Ashihara

Takanori Ashihara

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling

Add code
Jul 01, 2024
Figure 1 for SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Figure 2 for SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Figure 3 for SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling
Viaarxiv icon

Lightweight Zero-shot Text-to-Speech with Mixture of Adapters

Add code
Jul 01, 2024
Viaarxiv icon

Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over

Add code
Jun 27, 2024
Viaarxiv icon

Probing Self-supervised Learning Models with Target Speech Extraction

Add code
Feb 17, 2024
Viaarxiv icon

What Do Self-Supervised Speech and Speaker Models Learn? New Findings From a Cross Model Layer-Wise Analysis

Add code
Jan 31, 2024
Viaarxiv icon

Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters

Add code
Jan 10, 2024
Figure 1 for Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
Figure 2 for Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
Figure 3 for Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
Figure 4 for Noise-robust zero-shot text-to-speech synthesis conditioned on self-supervised speech-representation model with adapters
Viaarxiv icon

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

Add code
Jun 14, 2023
Figure 1 for SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Figure 2 for SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Figure 3 for SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Figure 4 for SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Viaarxiv icon

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

Add code
Jun 07, 2023
Figure 1 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 2 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 3 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 4 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Viaarxiv icon

Improving Scheduled Sampling for Neural Transducer-based ASR

Add code
May 25, 2023
Figure 1 for Improving Scheduled Sampling for Neural Transducer-based ASR
Figure 2 for Improving Scheduled Sampling for Neural Transducer-based ASR
Figure 3 for Improving Scheduled Sampling for Neural Transducer-based ASR
Figure 4 for Improving Scheduled Sampling for Neural Transducer-based ASR
Viaarxiv icon

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

Add code
May 25, 2023
Figure 1 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 2 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 3 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 4 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Viaarxiv icon