"speech recognition": models, code, and papers

DistilWhisper: Efficient Distillation of Multi-task Speech Models via Language-Specific Experts

Nov 02, 2023
Thomas Palmeira Ferraz, Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina

Server-side Rescoring of Spoken Entity-centric Knowledge Queries for Virtual Assistants

Nov 02, 2023
Youyuan Zhang, Sashank Gondala, Thiago Fraga-Silva, Christophe Van Gysel

Augmenty: A Python Library for Structured Text Augmentation

Dec 09, 2023
Kenneth Enevoldsen

Parameter-efficient Dysarthric Speech Recognition Using Adapter Fusion and Householder Transformation

Jun 12, 2023
Jinzi Qi, Hugo Van hamme

Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

May 21, 2023
Kaixun Huang, Ao Zhang, Zhanheng Yang, Pengcheng Guo, Bingshen Mu, Tianyi Xu, Lei Xie

Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture

Jul 05, 2023
Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

Transsion TSUP's speech recognition system for ASRU 2023 MADASR Challenge

Jul 20, 2023
Xiaoxiao Li, Gaosheng Zhang, An Zhu, Weiyong Li, Shuming Fang, Xiaoyue Yang, Jianchao Zhu

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

Jul 13, 2023
Zeping Min, Jinbo Wang

Careful Whisper -- leveraging advances in automatic speech recognition for robust and interpretable aphasia subtype classification

Aug 02, 2023
Laurin Wagner, Mario Zusag, Theresa Bloder

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Oct 23, 2023
Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li, Yashesh Gaur
