"speech recognition": models, code, and papers

RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans

Sep 19, 2023
Bohdan Didenko, Andrii Sameliuk

Personalization of CTC Speech Recognition Models

Oct 18, 2022
Saket Dingliwal, Monica Sunkara, Srikanth Ronanki, Jeff Farris, Katrin Kirchhoff, Sravan Bodapati

Learning Robust Self-attention Features for Speech Emotion Recognition with Label-adaptive Mixup

May 07, 2023
Lei Kang, Lichao Zhang, Dazhi Jiang

A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos

Jul 20, 2023
Anand Kumar Rai, Siddharth D Jaiswal, Animesh Mukherjee

MASR: Metadata Aware Speech Representation

Jul 20, 2023
Anjali Raj, Shikhar Bharadwaj, Sriram Ganapathy, Min Ma, Shikhar Vashishth

Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition

Sep 30, 2022
Chendong Zhao, Jianzong Wang, Wen qi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao

Learning Multi-modal Representations by Watching Hundreds of Surgical Video Lectures

Jul 27, 2023
Kun Yuan, Vinkle Srivastav, Tong Yu, Joel Lavanchy, Pietro Mascagni, Nassir Navab, Nicolas Padoy

Scaling Speech Technology to 1,000+ Languages

May 22, 2023
Vineel Pratap, Andros Tjandra, Bowen Shi, Paden Tomasello, Arun Babu, Sayani Kundu, Ali Elkahky, Zhaoheng Ni, Apoorv Vyas, Maryam Fazel-Zarandi, Alexei Baevski, Yossi Adi, Xiaohui Zhang, Wei-Ning Hsu, Alexis Conneau, Michael Auli

Progressive Multi-Scale Self-Supervised Learning for Speech Recognition

Dec 07, 2022
Genshun Wan, Tan Liu, Hang Chen, Jia Pan, Cong Liu, Zhongfu Ye

Prompting the Hidden Talent of Web-Scale Speech Models for Zero-Shot Task Generalization

May 18, 2023
Puyuan Peng, Brian Yan, Shinji Watanabe, David Harwath
