"speech recognition": models, code, and papers
A Further Study of Unsupervised Pre-training for Transformer Based Speech Recognition

Jun 23, 2020
Dongwei Jiang, Wubo Li, Ruixiong Zhang, Miao Cao, Ne Luo, Yang Han, Wei Zou, Xiangang Li


Essence Knowledge Distillation for Speech Recognition

Jun 26, 2019
Zhenchuan Yang, Chun Zhang, Weibin Zhang, Jianxiu Jin, Dongpeng Chen


Efficient spike encoding algorithms for neuromorphic speech recognition

Jul 14, 2022
Sidi Yaya Arnaud Yarga, Jean Rouat, Sean U. N. Wood


Large Raw Emotional Dataset with Aggregation Mechanism

Dec 23, 2022
Vladimir Kondratenko, Artem Sokolov, Nikolay Karpov, Oleg Kutuzov, Nikita Savushkin, Fyodor Minkin


End-to-end Audio-visual Speech Recognition with Conformers

Feb 12, 2021
Pingchuan Ma, Stavros Petridis, Maja Pantic


Multitask Learning from Augmented Auxiliary Data for Improving Speech Emotion Recognition

Jul 12, 2022
Siddique Latif, Rajib Rana, Sara Khalifa, Raja Jurdak, Björn W. Schuller


U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition

Jul 07, 2021
Di Wu, Binbin Zhang, Chao Yang, Zhendong Peng, Wenjing Xia, Xiaoyu Chen, Xin Lei


Semantic Data Augmentation for End-to-End Mandarin Speech Recognition

Apr 26, 2021
Jianwei Sun, Zhiyuan Tang, Hengxin Yin, Wei Wang, Xi Zhao, Shuaijiang Zhao, Xiaoning Lei, Wei Zou, Xiangang Li


Can we use Common Voice to train a Multi-Speaker TTS system?

Oct 12, 2022
Sewade Ogun, Vincent Colotte, Emmanuel Vincent


A context-aware knowledge transferring strategy for CTC-based ASR

Oct 12, 2022
Ke-Han Lu, Kuan-Yu Chen
