"speech recognition": models, code, and papers

A comparison of streaming models and data augmentation methods for robust speech recognition

Nov 19, 2021
Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim

Improving Massively Multilingual ASR With Auxiliary CTC Objectives

Feb 27, 2023
William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe

CHiME-6 Challenge: Tackling Multispeaker Speech Recognition for Unsegmented Recordings

May 02, 2020
Shinji Watanabe, Michael Mandel, Jon Barker, Emmanuel Vincent, Ashish Arora, Xuankai Chang, Sanjeev Khudanpur, Vimal Manohar, Daniel Povey, Desh Raj, David Snyder, Aswin Shanmugam Subramanian, Jan Trmal, Bar Ben Yair, Christoph Boeddeker, Zhaoheng Ni, Yusuke Fujita, Shota Horiguchi, Naoyuki Kanda, Takuya Yoshioka, Neville Ryant

WenetSpeech: A 10000+ Hours Multi-domain Mandarin Corpus for Speech Recognition

Oct 18, 2021
Binbin Zhang, Hang Lv, Pengcheng Guo, Qijie Shao, Chao Yang, Lei Xie, Xin Xu, Hui Bu, Xiaoyu Chen, Chenchen Zeng, Di Wu, Zhendong Peng

Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences

Mar 15, 2023
Yuan Tseng, Cheng-I Lai, Hung-yi Lee

Probing Speech Emotion Recognition Transformers for Linguistic Knowledge

Apr 01, 2022
Andreas Triantafyllopoulos, Johannes Wagner, Hagen Wierstorf, Maximilian Schmitt, Uwe Reichel, Florian Eyben, Felix Burkhardt, Björn W. Schuller

The NTNU System for Formosa Speech Recognition Challenge 2020

Apr 14, 2021
Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu, Yao-Ting Sung, Berlin Chen

Constrained Variational Autoencoder for improving EEG based Speech Recognition Systems

Jun 01, 2020
Gautam Krishna, Co Tran, Mason Carnahan, Ahmed Tewfik

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition

Nov 03, 2022
Zengrui Jin, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shujie Hu, Jiajun Deng, Guinan Li, Xunying Liu
