"speech recognition": models, code, and papers

Comparing CTC and LFMMI for out-of-domain adaptation of wav2vec 2.0 acoustic model

Apr 06, 2021
Apoorv Vyas, Srikanth Madikeri, Hervé Bourlard

Optimal Transport-based Adaptation in Dysarthric Speech Tasks

Apr 06, 2021
Rosanna Turrisi, Leonardo Badino

Software Engineering for AI-Based Systems: A Survey

May 05, 2021
Silverio Martínez-Fernández, Justus Bogner, Xavier Franch, Marc Oriol, Julien Siebert, Adam Trendowicz, Anna Maria Vollmer, Stefan Wagner

Multi-Scale Temporal Convolution Network for Classroom Voice Detection

May 31, 2021
Lu Ma, Xintian Wang, Song Yang, Yaguang Gong, Zhongqin Wu

Fully Learnable Front-End for Multi-Channel Acoustic Modeling using Semi-Supervised Learning

Feb 01, 2020
Sanna Wager, Aparna Khare, Minhua Wu, Kenichi Kumatani, Shiva Sundaram

CoVoST 2: A Massively Multilingual Speech-to-Text Translation Corpus

Jul 20, 2020
Changhan Wang, Anne Wu, Juan Pino

Zero-shot Speech Translation

Jul 13, 2021
Tu Anh Dinh

LSTM and GPT-2 Synthetic Speech Transfer Learning for Speaker Recognition to Overcome Data Scarcity

Jul 03, 2020
Jordan J. Bird, Diego R. Faria, Anikó Ekárt, Cristiano Premebida, Pedro P. S. Ayrosa

Towards Lifelong Learning of End-to-end ASR

Apr 04, 2021
Heng-Jui Chang, Hung-yi Lee, Lin-shan Lee

Persian phonemes recognition using PPNet

Dec 17, 2018
Saber Malekzadeh, Mohammad Hossein Gholizadeh, Seyed Naser Razavi
