"speech recognition": models, code, and papers

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

Nov 17, 2022
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Jul 07, 2023
Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi LI, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

Speech-dependent Modeling of Own Voice Transfer Characteristics for In-ear Microphones in Hearables

Sep 15, 2023
Mattes Ohlenbusch, Christian Rollwage, Simon Doclo

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Apr 01, 2022
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe

Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters

Jul 02, 2023
Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff

FOOCTTS: Generating Arabic Speech with Acoustic Environment for Football Commentator

Jun 07, 2023
Massa Baali, Ahmed Ali

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data

May 16, 2022
Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang

Adaptive Multi-Corpora Language Model Training for Speech Recognition

Nov 09, 2022
Yingyi Ma, Zhe Liu, Xuedong Zhang

Continual Learning for End-to-End ASR by Averaging Domain Experts

May 12, 2023
Peter Plantinga, Jaekwon Yoo, Chandra Dhir

Towards the Transferable Audio Adversarial Attack via Ensemble Methods

Apr 18, 2023
Feng Guo, Zheng Sun, Yuxuan Chen, Lei Ju
