"speech recognition": models, code, and papers

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning

Jul 07, 2021
Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Akihiko Takashima, Shota Orihashi, Naoki Makishima

Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR

Oct 11, 2022
Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman

Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture

Jan 15, 2020
Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan

Spectral Modification Based Data Augmentation For Improving End-to-End ASR For Children's Speech

Mar 13, 2022
Vishwanath Pratap Singh, Hardik Sailor, Supratik Bhattacharya, Abhishek Pandey

Data Augmentation Methods for End-to-end Speech Recognition on Distant-Talk Scenarios

Jun 07, 2021
Emiru Tsunoo, Kentaro Shibata, Chaitanya Narisetty, Yosuke Kashiwagi, Shinji Watanabe

Delay-penalized transducer for low-latency streaming ASR

Oct 31, 2022
Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Żelasko, Daniel Povey

Unsupervised Cross-lingual Representation Learning for Speech Recognition

Jun 24, 2020
Alexis Conneau, Alexei Baevski, Ronan Collobert, Abdelrahman Mohamed, Michael Auli

AHD ConvNet for Speech Emotion Classification

Jun 21, 2022
Asfand Ali, Danial Nasir, Mohammad Hassan Jawad

Robust Unstructured Knowledge Access in Conversational Dialogue with ASR Errors

Nov 08, 2022
Yik-Cheung Tam, Jiacheng Xu, Jiakai Zou, Zecheng Wang, Tinglong Liao, Shuhan Yuan

PARP: Prune, Adjust and Re-Prune for Self-Supervised Speech Recognition

Jun 10, 2021
Cheng-I Jeff Lai, Yang Zhang, Alexander H. Liu, Shiyu Chang, Yi-Lun Liao, Yung-Sung Chuang, Kaizhi Qian, Sameer Khurana, David Cox, James Glass
