
"speech": models, code, and papers

Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages

Jul 03, 2023
Devang Kulshreshtha, Saket Dingliwal, Brady Houston, Sravan Bodapati

Via arXiv

ACO-tagger: A Novel Method for Part-of-Speech Tagging using Ant Colony Optimization

Mar 27, 2023
Amirhossein Mohammadi, Sara Hajiaghajani, Mohammad Bahrani

Via arXiv

EM-Network: Oracle Guided Self-distillation for Sequence Learning

Jun 14, 2023
Ji Won Yoon, Sunghwan Ahn, Hyeonseung Lee, Minchan Kim, Seok Min Kim, Nam Soo Kim

Via arXiv

Investigating Pre-trained Audio Encoders in the Low-Resource Condition

May 28, 2023
Hao Yang, Jinming Zhao, Gholamreza Haffari, Ehsan Shareghi

Via arXiv

Conformers are All You Need for Visual Speech Recogntion

Feb 17, 2023
Oscar Chang, Hank Liao, Dmitriy Serdyuk, Ankit Shah, Olivier Siohan

Via arXiv

High-Fidelity Audio Compression with Improved RVQGAN

Jun 11, 2023
Rithesh Kumar, Prem Seetharaman, Alejandro Luebs, Ishaan Kumar, Kundan Kumar

Via arXiv

Self-regularised Minimum Latency Training for Streaming Transformer-based Speech Recognition

Apr 24, 2023
Mohan Li, Rama Doddipatla, Catalin Zorila

Via arXiv

deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition

Feb 28, 2023
Dianwen Ng, Ruixi Zhang, Jia Qi Yip, Zhao Yang, Jinjie Ni, Chong Zhang, Yukun Ma, Chongjia Ni, Eng Siong Chng, Bin Ma

Via arXiv

A Comparison of Speech Data Augmentation Methods Using S3PRL Toolkit

Feb 27, 2023
Mina Huh, Ruchira Ray, Corey Karnei

Via arXiv

Bridging the Granularity Gap for Acoustic Modeling

May 27, 2023
Chen Xu, Yuhao Zhang, Chengbo Jiao, Xiaoqian Liu, Chi Hu, Xin Zeng, Tong Xiao, Anxiang Ma, Huizhen Wang, JingBo Zhu

Via arXiv