"speech recognition": models, code, and papers

Master-ASR: Achieving Multilingual Scalability and Low-Resource Adaptation in ASR with Modular Learning

Jun 23, 2023
Zhongzhi Yu, Yang Zhang, Kaizhi Qian, Yonggan Fu, Yingyan Lin

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Dec 02, 2022
Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang Li, Tao Qin, Edward Lin, Tie-Yan Liu

Can Contextual Biasing Remain Effective with Whisper and GPT-2?

Jun 02, 2023
Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland

Improved DeepFake Detection Using Whisper Features

Jun 02, 2023
Piotr Kawa, Marcin Plata, Michał Czuba, Piotr Szymański, Piotr Syga

DisfluencyFixer: A tool to enhance Language Learning through Speech To Speech Disfluency Correction

May 26, 2023
Vineet Bhat, Preethi Jyothi, Pushpak Bhattacharyya

On the Robustness of Arabic Speech Dialect Identification

Jun 01, 2023
Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed

Exploration of Efficient End-to-End ASR using Discretized Input from Self-Supervised Learning

May 29, 2023
Xuankai Chang, Brian Yan, Yuya Fujita, Takashi Maekaku, Shinji Watanabe

Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation

May 29, 2023
Yui Sudo, Kazuya Hata, Kazuhiro Nakadai

BIG-C: a Multimodal Multi-Purpose Dataset for Bemba

May 26, 2023
Claytone Sikasote, Eunice Mukonde, Md Mahfuz Ibn Alam, Antonios Anastasopoulos

Unsupervised domain adaptation for speech recognition with unsupervised error correction

Sep 24, 2022
Long Mai, Julie Carson-Berndsen
