"speech recognition": models, code, and papers

Letter-Based Speech Recognition with Gated ConvNets

Dec 22, 2017
Vitaliy Liptchinsky, Gabriel Synnaeve, Ronan Collobert

CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition

Jul 18, 2022
Xin-Cheng Wen, Jia-Xin Ye, Yan Luo, Yong Xu, Xuan-Ze Wang, Chang-Li Wu, Kun-Hong Liu

XTREME-S: Evaluating Cross-lingual Speech Representations

Apr 13, 2022
Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan Van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

NN-grams: Unifying neural network and n-gram language models for Speech Recognition

Jun 23, 2016
Babak Damavandi, Shankar Kumar, Noam Shazeer, Antoine Bruguier

Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition

Aug 06, 2020
Liangfa Wei, Jie Zhang, Junfeng Hou, Lirong Dai

Gaussian Kernelized Self-Attention for Long Sequence Data and Its Application to CTC-based Speech Recognition

Feb 18, 2021
Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe

Contextual Speech Recognition with Difficult Negative Training Examples

Oct 29, 2018
Uri Alon, Golan Pundak, Tara N. Sainath

Interpretable Dysarthric Speaker Adaptation based on Optimal-Transport

Mar 14, 2022
Rosanna Turrisi, Leonardo Badino

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

Feb 20, 2023
Jaesung Huh, Andrew Brown, Jee-weon Jung, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, Andrew Zisserman

Serialized Output Training for End-to-End Overlapped Speech Recognition

Mar 28, 2020
Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka
