"speech recognition": models, code, and papers

Efficient Active Learning for Automatic Speech Recognition via Augmented Consistency Regularization

Jun 19, 2020
Jihwan Bang, Heesu Kim, YoungJoon Yoo, Jung-Woo Ha

Generating Synthetic Audio Data for Attention-Based Speech Recognition Systems

Dec 19, 2019
Nick Rossenbach, Albert Zeyer, Ralf Schlüter, Hermann Ney

Kernel Approximation Methods for Speech Recognition

Jan 13, 2017
Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha

CTL-MTNet: A Novel CapsNet and Transfer Learning-Based Mixed Task Net for the Single-Corpus and Cross-Corpus Speech Emotion Recognition

Jul 18, 2022
Xin-Cheng Wen, Jia-Xin Ye, Yan Luo, Yong Xu, Xuan-Ze Wang, Chang-Li Wu, Kun-Hong Liu

Multi-task self-supervised learning for Robust Speech Recognition

Jan 25, 2020
Mirco Ravanelli, Jianyuan Zhong, Santiago Pascual, Pawel Swietojanski, Joao Monteiro, Jan Trmal, Yoshua Bengio

VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

Feb 20, 2023
Jaesung Huh, Andrew Brown, Jee-weon Jung, Joon Son Chung, Arsha Nagrani, Daniel Garcia-Romero, Andrew Zisserman

Efficient Neural Architecture Search for End-to-end Speech Recognition via Straight-Through Gradients

Nov 11, 2020
Huahuan Zheng, Keyu An, Zhijian Ou

XTREME-S: Evaluating Cross-lingual Speech Representations

Apr 13, 2022
Alexis Conneau, Ankur Bapna, Yu Zhang, Min Ma, Patrick von Platen, Anton Lozhkov, Colin Cherry, Ye Jia, Clara Rivera, Mihir Kale, Daan Van Esch, Vera Axelrod, Simran Khanuja, Jonathan H. Clark, Orhan Firat, Michael Auli, Sebastian Ruder, Jason Riesa, Melvin Johnson

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models

Sep 17, 2019
Naoyuki Kanda, Shota Horiguchi, Yusuke Fujita, Yawen Xue, Kenji Nagamatsu, Shinji Watanabe

Cross-modal Fusion Techniques for Utterance-level Emotion Recognition from Text and Speech

Feb 05, 2023
Jiachen Luo, Huy Phan, Joshua Reiss
