"speech recognition": models, code, and papers

Cross-Language Speech Emotion Recognition Using Multimodal Dual Attention Transformers

Jul 14, 2023
Syed Aun Muhammad Zaidi, Siddique Latif, Junaid Qadir

Non-autoregressive End-to-end Approaches for Joint Automatic Speech Recognition and Spoken Language Understanding

Apr 21, 2023
Mohan Li, Rama Doddipatla

From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition

Jan 19, 2023
Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman

A Deep Learning System for Domain-specific Speech Recognition

Mar 18, 2023
Yanan Jia

Improving CTC-AED model with integrated-CTC and auxiliary loss regularization

Aug 15, 2023
Daobin Zhu, Xiangdong Su, Hongbin Zhang

Speech Diarization and ASR with GMM

Jul 11, 2023
Aayush Kumar Sharma, Vineet Bhavikatti, Amogh Nidawani, Dr. Siddappaji, Sanath P, Dr. Geetishree Mishra

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Sep 05, 2023
Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra

Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong General Audio Event Taggers

Jul 06, 2023
Yuan Gong, Sameer Khurana, Leonid Karlinsky, James Glass

AdVerb: Visually Guided Audio Dereverberation

Aug 23, 2023
Sanjoy Chowdhury, Sreyan Ghosh, Subhrajyoti Dasgupta, Anton Ratnarajah, Utkarsh Tyagi, Dinesh Manocha

Evaluating Automatic Speech Recognition in an Incremental Setting

Feb 23, 2023
Ryan Whetten, Mir Tahsin Imtiaz, Casey Kennington
