"speech recognition": models, code, and papers

Speaker diarization assisted ASR for multi-speaker conversations

Apr 05, 2021
Srikanth Raj Chetupalli, Sriram Ganapathy

Via arXiv

Google Crowdsourced Speech Corpora and Related Open-Source Resources for Low-Resource Languages and Dialects: An Overview

Oct 14, 2020
Alena Butryna, Shan-Hui Cathy Chu, Isin Demirsahin, Alexander Gutkin, Linne Ha, Fei He, Martin Jansche, Cibu Johny, Anna Katanova, Oddur Kjartansson, Chenfang Li, Tatiana Merkulova, Yin May Oo, Knot Pipatsrisawat, Clara Rivera, Supheakmungkol Sarin, Pasindu de Silva, Keshan Sodimana, Richard Sproat, Theeraphol Wattanavekin, Jaka Aris Eko Wibawa

Via arXiv

Spartus: A 9.4 TOp/s FPGA-based LSTM Accelerator Exploiting Spatio-temporal Sparsity

Aug 20, 2021
Chang Gao, Tobi Delbruck, Shih-Chii Liu

Via arXiv

What can predictive speech coders learn from speaker recognizers?

Apr 05, 2022
Marcos Faundez-Zanuy

Via arXiv

Tiny Transducer: A Highly-efficient Speech Recognition Model on Edge Devices

Jan 18, 2021
Yuekai Zhang, Sining Sun, Long Ma

Via arXiv

Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations

Jul 26, 2021
Se-Yun Um, Jihyun Kim, Jihyun Lee, Sangshin Oh, Kyungguen Byun, Hong-Goo Kang

Via arXiv

A Unified Speaker Adaptation Approach for ASR

Oct 16, 2021
Yingzhu Zhao, Chongjia Ni, Cheung-Chi Leung, Shafiq Joty, Eng Siong Chng, Bin Ma

Via arXiv

Modified SPLICE and its Extension to Non-Stereo Data for Noise Robust Speech Recognition

Jul 15, 2013
D. S. Pavan Kumar, N. Vishnu Prasad, Vikas Joshi, S. Umesh

Via arXiv

Multi-mode Transformer Transducer with Stochastic Future Context

Jun 17, 2021
Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

Via arXiv

Challenges and Obstacles Towards Deploying Deep Learning Models on Mobile Devices

May 06, 2021
Hamid Tabani, Ajay Balasubramaniam, Elahe Arani, Bahram Zonooz

Via arXiv