"speech recognition": models, code, and papers

Streaming Voice Conversion Via Intermediate Bottleneck Features And Non-streaming Teacher Guidance

Oct 27, 2022
Yuanzhe Chen, Ming Tu, Tang Li, Xin Li, Qiuqiang Kong, Jiaxin Li, Zhichao Wang, Qiao Tian, Yuping Wang, Yuxuan Wang

On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches

Nov 16, 2022
Guilherme Schu, Parvaneh Janbakhshi, Ina Kodrasi

A Comparative Study on Neural Architectures and Training Methods for Japanese Speech Recognition

Jun 09, 2021
Shigeki Karita, Yotaro Kubo, Michiel Adriaan Unico Bacchiani, Llion Jones

Multimodal Speech Emotion Recognition using Cross Attention with Aligned Audio and Text

Jul 26, 2022
Yoonhyung Lee, Seunghyun Yoon, Kyomin Jung

Speech Aware Dialog System Technology Challenge (DSTC11)

Dec 16, 2022
Hagen Soltau, Izhak Shafran, Mingqiu Wang, Abhinav Rastogi, Jeffrey Zhao, Ye Jia, Wei Han, Yuan Cao, Aramys Miranda

Bridging Speech and Textual Pre-trained Models with Unsupervised ASR

Nov 06, 2022
Jiatong Shi, Chan-Jan Hsu, Holam Chung, Dongji Gao, Paola Garcia, Shinji Watanabe, Ann Lee, Hung-yi Lee

A Graph Isomorphism Network with Weighted Multiple Aggregators for Speech Emotion Recognition

Jul 03, 2022
Ying Hu, Yuwu Tang, Hao Huang, Liang He

Iterative Pseudo-Labeling for Speech Recognition

May 19, 2020
Qiantong Xu, Tatiana Likhomanenko, Jacob Kahn, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

Efficient Use of Large Pre-Trained Models for Low Resource ASR

Oct 26, 2022
Peter Vieting, Christoph Lüscher, Julian Dierkes, Ralf Schlüter, Hermann Ney

Improving Transformer-based Conversational ASR by Inter-Sentential Attention Mechanism

Jul 02, 2022
Kun Wei, Pengcheng Guo, Ning Jiang
