"speech recognition": models, code, and papers

Fully Convolutional Speech Recognition

Dec 17, 2018
Neil Zeghidour, Qiantong Xu, Vitaliy Liptchinsky, Nicolas Usunier, Gabriel Synnaeve, Ronan Collobert

Separator-Transducer-Segmenter: Streaming Recognition and Segmentation of Multi-party Speech

May 10, 2022
Ilya Sklyar, Anna Piunova, Christian Osendorfer

UniSpeech at scale: An Empirical Study of Pre-training Method on Large-Scale Speech Recognition Dataset

Jul 12, 2021
Chengyi Wang, Yu Wu, Shujie Liu, Jinyu Li, Yao Qian, Kenichi Kumatani, Furu Wei

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

May 14, 2021
Swayambhu Nath Ray, Minhua Wu, Anirudh Raju, Pegah Ghahremani, Raghavendra Bilgi, Milind Rao, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Jasha Droppo

Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search

Apr 13, 2021
Yukun Liu, Ta Li, Pengyuan Zhang, Yonghong Yan

Can we use Common Voice to train a Multi-Speaker TTS system?

Oct 12, 2022
Sewade Ogun, Vincent Colotte, Emmanuel Vincent

A context-aware knowledge transferring strategy for CTC-based ASR

Oct 12, 2022
Ke-Han Lu, Kuan-Yu Chen

Gradient-Adjusted Neuron Activation Profiles for Comprehensive Introspection of Convolutional Speech Recognition Models

Feb 19, 2020
Andreas Krug, Sebastian Stober

Comparison of Soft and Hard Target RNN-T Distillation for Large-scale ASR

Oct 11, 2022
Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman
