Naohiro Tawara

Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over

Jun 27, 2024

BLSTM-Based Confidence Estimation for End-to-End Speech Recognition

Dec 22, 2023

Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models

Dec 20, 2023

Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition

Oct 17, 2023

Discriminative Training of VBx Diarization

Oct 04, 2023

NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization

Sep 22, 2023

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

May 23, 2023

Advances in integration of end-to-end neural and clustering-based diarization for real conversational speech

May 19, 2021

Integrating end-to-end neural and clustering-based diarization: Getting the best of both worlds

Oct 26, 2020

Improving speaker discrimination of target speech extraction with time-domain SpeakerBeam

Jan 23, 2020