"speech recognition": models, code, and papers

FastCorrect: Fast Error Correction with Edit Alignment for Automatic Speech Recognition

Jun 03, 2021
Yichong Leng, Xu Tan, Linchen Zhu, Jin Xu, Renqian Luo, Linquan Liu, Tao Qin, Xiang-Yang Li, Ed Lin, Tie-Yan Liu

Online Automatic Speech Recognition with Listen, Attend and Spell Model

Aug 12, 2020
Roger Hsiao, Dogan Can, Tim Ng, Ruchir Travadi, Arnab Ghoshal

Multimodal Grounding for Sequence-to-Sequence Speech Recognition

Nov 09, 2018
Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze

FullPack: Full Vector Utilization for Sub-Byte Quantized Inference on General Purpose CPUs

Nov 20, 2022
Hossein Katebi, Navidreza Asadi, Maziar Goudarzi

Two-Staged Acoustic Modeling Adaption for Robust Speech Recognition by the Example of German Oral History Interviews

Aug 19, 2019
Michael Gref, Christoph Schmidt, Sven Behnke, Joachim Köhler

End-to-End Code Switching Language Models for Automatic Speech Recognition

Jun 16, 2020
Ahan M. R., Shreyas Sunil Kulkarni

Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget

Jun 15, 2021
Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin

Attention-based Transducer for Online Speech Recognition

May 18, 2020
Bin Wang, Yan Yin, Hui Lin

Is Lip Region-of-Interest Sufficient for Lipreading?

May 28, 2022
Jing-Xuan Zhang, Gen-Shun Wan, Jia Pan

Multi-head Monotonic Chunkwise Attention For Online Speech Recognition

May 01, 2020
Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma
