"speech recognition": models, code, and papers

Toward domain-invariant speech recognition via large scale training

Aug 16, 2018
Arun Narayanan, Ananya Misra, Khe Chai Sim, Golan Pundak, Anshuman Tripathi, Mohamed Elfeky, Parisa Haghani, Trevor Strohman, Michiel Bacchiani

Via arXiv

Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation

Aug 11, 2021
Sarala Padi, Seyed Omid Sadjadi, Dinesh Manocha, Ram D. Sriram

Via arXiv

Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction

Oct 28, 2021
Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang

Via arXiv

Unsupervised Data Selection via Discrete Speech Representation for ASR

Apr 05, 2022
Zhiyun Lu, Yongqiang Wang, Yu Zhang, Wei Han, Zhehuai Chen, Parisa Haghani

Via arXiv

Factorized Neural Transducer for Efficient Language Model Adaptation

Oct 07, 2021
Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li

Via arXiv

Deep-FSMN for Large Vocabulary Continuous Speech Recognition

Mar 04, 2018
Shiliang Zhang, Ming Lei, Zhijie Yan, Lirong Dai

Via arXiv

Deliberation Model for On-Device Spoken Language Understanding

Apr 04, 2022
Duc Le, Akshat Shrivastava, Paden Tomasello, Suyoun Kim, Aleksandr Livshits, Ozlem Kalinli, Michael L. Seltzer

Via arXiv

Constructing Long Short-Term Memory based Deep Recurrent Neural Networks for Large Vocabulary Speech Recognition

May 11, 2015
Xiangang Li, Xihong Wu

Via arXiv

Phoneme-Based Contextualization for Cross-Lingual Speech Recognition in End-to-End Models

Jul 22, 2019
Ke Hu, Antoine Bruguier, Tara N. Sainath, Rohit Prabhavalkar, Golan Pundak

Via arXiv

Multimodal Depression Classification Using Articulatory Coordination Features And Hierarchical Attention Based Text Embeddings

Feb 13, 2022
Nadee Seneviratne, Carol Espy-Wilson

Via arXiv