Naoyuki Kanda

Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition

Jun 04, 2021
Via arXiv

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone

Apr 12, 2021
Via arXiv

End-to-End Speaker-Attributed ASR with Transformer

Apr 05, 2021
Via arXiv

Streaming Multi-talker Speech Recognition with Joint Speaker Identification

Apr 05, 2021
Via arXiv

Speech-language Pre-training for End-to-end Spoken Language Understanding

Feb 11, 2021
Via arXiv

Internal Language Model Training for Domain-Adaptive End-to-End Speech Recognition

Feb 02, 2021
Via arXiv

A Review of Speaker Diarization: Recent Advances with Deep Learning

Jan 24, 2021
Via arXiv

Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings

Jan 06, 2021
Via arXiv

Streaming end-to-end multi-talker speech recognition

Nov 26, 2020
Via arXiv

Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR

Nov 03, 2020
Via arXiv