"speech recognition": models, code, and papers

EURO: ESPnet Unsupervised ASR Open-source Toolkit

Dec 01, 2022
Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola Garcia, Hung-yi Lee, Shinji Watanabe, Sanjeev Khudanpur

BRENT: Bidirectional Retrieval Enhanced Norwegian Transformer

Apr 19, 2023
Lucas Georges Gabriel Charpentier, Sondre Wold, David Samuel, Egil Rønningstad

Time-frequency Network for Robust Speaker Recognition

Mar 07, 2023
Jiguo Li, Tianzi Zhang, Xiaobin Liu, Lirong Zheng

SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing

Feb 27, 2023
Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Improving Speech-to-Speech Translation Through Unlabeled Text

Oct 26, 2022
Xuan-Phi Nguyen, Sravya Popuri, Changhan Wang, Yun Tang, Ilia Kulikov, Hongyu Gong

Adapting GPT, GPT-2 and BERT Language Models for Speech Recognition

Jul 29, 2021
Xianrui Zheng, Chao Zhang, Philip C. Woodland

Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation

Oct 27, 2022
Marvin Lavechin, Marianne Métais, Hadrien Titeux, Alodie Boissonnet, Jade Copet, Morgane Rivière, Elika Bergelson, Alejandrina Cristia, Emmanuel Dupoux, Hervé Bredin

The Role of Phonetic Units in Speech Emotion Recognition

Aug 02, 2021
Jiahong Yuan, Xingyu Cai, Renjie Zheng, Liang Huang, Kenneth Church

Massively Multilingual Adversarial Speech Recognition

Apr 03, 2019
Oliver Adams, Matthew Wiesner, Shinji Watanabe, David Yarowsky

End-to-End Speech to Intent Prediction to improve E-commerce Customer Support Voicebot in Hindi and English

Oct 26, 2022
Abhinav Goyal, Anupam Singh, Nikesh Garera
