Picture for Atsunori Ogawa

Atsunori Ogawa

BLSTM-Based Confidence Estimation for End-to-End Speech Recognition

Dec 22, 2023
Viaarxiv icon

Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models

Dec 20, 2023
Viaarxiv icon

Iterative Shallow Fusion of Backward Language Model for End-to-End Speech Recognition

Oct 17, 2023
Viaarxiv icon

NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization

Sep 22, 2023
Figure 1 for NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Figure 2 for NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Figure 3 for NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Figure 4 for NTT speaker diarization system for CHiME-7: multi-domain, multi-microphone End-to-end and vector clustering diarization
Viaarxiv icon

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization

Jun 07, 2023
Figure 1 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 2 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 3 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Figure 4 for Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization
Viaarxiv icon

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023
Figure 1 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 2 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 3 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Figure 4 for Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data
Viaarxiv icon

Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

Add code
May 23, 2023
Figure 1 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Figure 2 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Figure 3 for Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization
Viaarxiv icon

Leveraging Large Text Corpora for End-to-End Speech Summarization

Add code
Mar 02, 2023
Figure 1 for Leveraging Large Text Corpora for End-to-End Speech Summarization
Figure 2 for Leveraging Large Text Corpora for End-to-End Speech Summarization
Figure 3 for Leveraging Large Text Corpora for End-to-End Speech Summarization
Figure 4 for Leveraging Large Text Corpora for End-to-End Speech Summarization
Viaarxiv icon

Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening

Mar 31, 2022
Figure 1 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 2 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 3 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Figure 4 for Subjective intelligibility of speech sounds enhanced by ideal ratio mask via crowdsourced remote experiments with effective data screening
Viaarxiv icon

Attention-based Multi-hypothesis Fusion for Speech Summarization

Add code
Nov 16, 2021
Figure 1 for Attention-based Multi-hypothesis Fusion for Speech Summarization
Figure 2 for Attention-based Multi-hypothesis Fusion for Speech Summarization
Figure 3 for Attention-based Multi-hypothesis Fusion for Speech Summarization
Figure 4 for Attention-based Multi-hypothesis Fusion for Speech Summarization
Viaarxiv icon