"speech recognition": models, code, and papers

Combining Spectral and Self-Supervised Features for Low Resource Speech Recognition and Translation

Apr 18, 2022
Dan Berrebbi, Jiatong Shi, Brian Yan, Osbel Lopez-Francisco, Jonathan D. Amith, Shinji Watanabe

(4 figures)

Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study

Mar 12, 2023
Salah Zaiem, Robin Algayres, Titouan Parcollet, Slim Essid, Mirco Ravanelli

(3 figures)

Improving the Intent Classification accuracy in Noisy Environment

Mar 12, 2023
Mohamed Nabih Ali, Alessio Brutti, Daniele Falavigna

(4 figures)

Private Language Model Adaptation for Speech Recognition

Sep 28, 2021
Zhe Liu, Ke Li, Shreyan Bakshi, Fuchun Peng

(4 figures)

Multilingual and crosslingual speech recognition using phonological-vector based phone embeddings

Jul 11, 2021
Chengrui Zhu, Keyu An, Huahuan Zheng, Zhijian Ou

(4 figures)

Lego-Features: Exporting modular encoder features for streaming and deliberation ASR

Mar 31, 2023
Rami Botros, Rohit Prabhavalkar, Johan Schalkwyk, Ciprian Chelba, Tara N. Sainath, Françoise Beaufays

(4 figures)

LiteG2P: A fast, light and high accuracy model for grapheme-to-phoneme conversion

Mar 02, 2023
Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma

(4 figures)

Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding

Feb 27, 2023
Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe

(4 figures)

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding

Jul 06, 2022
Yifan Peng, Siddharth Dalmia, Ian Lane, Shinji Watanabe

(4 figures)

RescoreBERT: Discriminative Speech Recognition Rescoring with BERT

Feb 07, 2022
Liyan Xu, Yile Gu, Jari Kolehmainen, Haidar Khan, Ankur Gandhe, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko

(4 figures)