"speech recognition": models, code, and papers

Multi-channel Conversational Speaker Separation via Neural Diarization

Nov 15, 2023
Hassan Taherian, DeLiang Wang

Via arXiv

Research on an improved Conformer end-to-end Speech Recognition Model with R-Drop Structure

Jun 14, 2023
Weidong Ji, Shijie Zan, Guohui Zhou, Xu Wang

Via arXiv

Whisper in Focus: Enhancing Stuttered Speech Classification with Encoder Layer Optimization

Nov 09, 2023
Huma Ameer, Seemab Latif, Rabia Latif, Sana Mukhtar

Via arXiv

Design, construction and evaluation of emotional multimodal pathological speech database

Dec 14, 2023
Ting Zhu, Shufei Duan, Huizhi Liang, Wei Zhang

Via arXiv

Contextualized End-to-End Speech Recognition with Contextual Phrase Prediction Network

May 21, 2023
Kaixun Huang, Ao Zhang, Zhanheng Yang, Pengcheng Guo, Bingshen Mu, Tianyi Xu, Lei Xie

Via arXiv

Parameter-efficient Dysarthric Speech Recognition Using Adapter Fusion and Householder Transformation

Jun 12, 2023
Jinzi Qi, Hugo Van hamme

Via arXiv

Towards End-to-End Spoken Grammatical Error Correction

Nov 09, 2023
Stefano Bannò, Rao Ma, Mengjie Qian, Kate M. Knill, Mark J. F. Gales

Via arXiv

Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture

Jul 05, 2023
Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

Via arXiv

Transsion TSUP's speech recognition system for ASRU 2023 MADASR Challenge

Jul 20, 2023
Xiaoxiao Li, Gaosheng Zhang, An Zhu, Weiyong Li, Shuming Fang, Xiaoyue Yang, Jianchao Zhu

Via arXiv

Exploring the Integration of Large Language Models into Automatic Speech Recognition Systems: An Empirical Study

Jul 13, 2023
Zeping Min, Jinbo Wang

Via arXiv