Alert button
Picture for Takaaki Hori

Takaaki Hori

Alert button

End-to-End Speech Recognition: A Survey

Add code
Bookmark button
Alert button
Mar 03, 2023
Rohit Prabhavalkar, Takaaki Hori, Tara N. Sainath, Ralf Schlüter, Shinji Watanabe

Figure 1 for End-to-End Speech Recognition: A Survey
Figure 2 for End-to-End Speech Recognition: A Survey
Figure 3 for End-to-End Speech Recognition: A Survey
Figure 4 for End-to-End Speech Recognition: A Survey
Viaarxiv icon

Variable Attention Masking for Configurable Transformer Transducer Speech Recognition

Add code
Bookmark button
Alert button
Nov 02, 2022
Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang

Figure 1 for Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Figure 2 for Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Figure 3 for Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Figure 4 for Variable Attention Masking for Configurable Transformer Transducer Speech Recognition
Viaarxiv icon

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR

Add code
Bookmark button
Alert button
Mar 01, 2022
Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux

Figure 1 for Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR
Figure 2 for Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR
Figure 3 for Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR
Figure 4 for Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR
Viaarxiv icon

Sequence Transduction with Graph-based Supervision

Add code
Bookmark button
Alert button
Nov 01, 2021
Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux

Figure 1 for Sequence Transduction with Graph-based Supervision
Figure 2 for Sequence Transduction with Graph-based Supervision
Figure 3 for Sequence Transduction with Graph-based Supervision
Viaarxiv icon

Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning

Add code
Bookmark button
Alert button
Oct 13, 2021
Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori

Figure 1 for Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Figure 2 for Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Figure 3 for Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Figure 4 for Audio-Visual Scene-Aware Dialog and Reasoning using Audio-Visual Transformers with Joint Student-Teacher Learning
Viaarxiv icon

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy

Add code
Bookmark button
Alert button
Oct 11, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Figure 1 for Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Figure 2 for Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Figure 3 for Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy
Viaarxiv icon

Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers

Add code
Bookmark button
Alert button
Aug 04, 2021
Chiori Hori, Takaaki Hori, Jonathan Le Roux

Figure 1 for Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Figure 2 for Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Figure 3 for Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Figure 4 for Optimizing Latency for Online Video CaptioningUsing Audio-Visual Transformers
Viaarxiv icon

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition

Add code
Bookmark button
Alert button
Jul 02, 2021
Niko Moritz, Takaaki Hori, Jonathan Le Roux

Figure 1 for Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Figure 2 for Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition
Viaarxiv icon

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition

Add code
Bookmark button
Alert button
Jun 16, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Figure 1 for Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition
Figure 2 for Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition
Viaarxiv icon

Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers

Add code
Bookmark button
Alert button
Apr 19, 2021
Takaaki Hori, Niko Moritz, Chiori Hori, Jonathan Le Roux

Figure 1 for Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Figure 2 for Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Figure 3 for Advanced Long-context End-to-end Speech Recognition Using Context-expanded Transformers
Viaarxiv icon