"speech recognition": models, code, and papers

Pretraining Conformer with ASR or ASV for Anti-Spoofing Countermeasure

Jul 04, 2023
Yikang Wang, Hiromitsu Nishizaki, Ming Li

Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR

May 18, 2023
Hang Shao, Wei Wang, Bei Liu, Xun Gong, Haoyu Wang, Yanmin Qian

Cross-Modal Mutual Learning for Cued Speech Recognition

Dec 02, 2022
Lei Liu, Li Liu

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Jul 07, 2023
Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi LI, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

Blind Signal Dereverberation for Machine Speech Recognition

Sep 30, 2022
Samik Sadhu, Hynek Hermansky

Modeling Spoken Information Queries for Virtual Assistants: Open Problems, Challenges and Opportunities

Apr 25, 2023
Christophe Van Gysel

Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters

Jul 02, 2023
Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff

An Empirical Study and Improvement for Speech Emotion Recognition

Apr 08, 2023
Zhen Wu, Yizhe Lu, Xinyu Dai

TALCS: An Open-Source Mandarin-English Code-Switching Corpus and a Speech Recognition Baseline

Jun 27, 2022
Chengfei Li, Shuhao Deng, Yaoping Wang, Guangjing Wang, Yaguang Gong, Changbin Chen, Jinfeng Bai
