"speech recognition": models, code, and papers

HK-LegiCoST: Leveraging Non-Verbatim Transcripts for Speech Translation

Jun 20, 2023
Cihan Xiao, Henry Li Xinyuan, Jinyi Yang, Dongji Gao, Matthew Wiesner, Kevin Duh, Sanjeev Khudanpur

Kaggle Competition: Cantonese Audio-Visual Speech Recognition for In-car Commands

Jul 06, 2022
Wenliang Dai, Samuel Cahyawijaya, Tiezheng Yu, Elham J Barezi, Pascale Fung

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Jun 29, 2023
Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi LI, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities

Jul 22, 2022
Pranav Dheram, Murugesan Ramakrishnan, Anirudh Raju, I-Fan Chen, Brian King, Katherine Powell, Melissa Saboowala, Karan Shetty, Andreas Stolcke

Bengali Common Voice Speech Dataset for Automatic Speech Recognition

Jun 29, 2022
Samiul Alam, Asif Sushmit, Zaowad Abdullah, Shahrin Nakkhatra, MD. Nazmuddoha Ansary, Syed Mobassir Hossen, Sazia Morshed Mehnaz, Tahsin Reasat, Ahmed Imtiaz Humayun

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models

May 24, 2023
David Qiu, David Rim, Shaojin Ding, Oleg Rybakov, Yanzhang He

Strategies in Transfer Learning for Low-Resource Speech Synthesis: Phone Mapping, Features Input, and Source Language Selection

Jun 21, 2023
Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios

Jun 23, 2023
Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur

Large vocabulary speech recognition for languages of Africa: multilingual modeling and self-supervised learning

Aug 05, 2022
Sandy Ritchie, You-Chi Cheng, Mingqing Chen, Rajiv Mathews, Daan van Esch, Bo Li, Khe Chai Sim

Towards the Transferable Audio Adversarial Attack via Ensemble Methods

Apr 18, 2023
Feng Guo, Zheng Sun, Yuxuan Chen, Lei Ju
