"speech recognition": models, code, and papers

Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models

Dec 03, 2022
Reem Gody, David Harwath

Via arXiv

Device Directedness with Contextual Cues for Spoken Dialog Systems

Nov 23, 2022
Dhanush Bekal, Sundararajan Srinivasan, Sravan Bodapati, Srikanth Ronanki, Katrin Kirchhoff

Via arXiv

Fast End-to-End Speech Recognition via a Non-Autoregressive Model and Cross-Modal Knowledge Transferring from BERT

Feb 16, 2021
Ye Bai, Jiangyan Yi, Jianhua Tao, Zhengkun Tian, Zhengqi Wen, Shuai Zhang

Via arXiv

Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

Oct 28, 2022
Sebastian P. Bayerl, Dominik Wagner, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer

Via arXiv

Robust Multi-channel Speech Recognition using Frequency Aligned Network

Feb 06, 2020
Taejin Park, Kenichi Kumatani, Minhua Wu, Shiva Sundaram

Via arXiv

Time-frequency Network for Robust Speaker Recognition

Mar 07, 2023
Jiguo Li, Tianzi Zhang, Xiaobin Liu, Lirong Zheng

Via arXiv

Efficient conformer: Progressive downsampling and grouped attention for automatic speech recognition

Sep 08, 2021
Maxime Burchi, Valentin Vielzeuf

Via arXiv

Speech recognition for air traffic control via feature learning and end-to-end training

Nov 04, 2021
Peng Fan, Dongyue Guo, Yi Lin, Bo Yang, Jianwei Zhang

Via arXiv

Simulating realistic speech overlaps improves multi-talker ASR

Nov 17, 2022
Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka

Via arXiv

SpeechFormer++: A Hierarchical Efficient Framework for Paralinguistic Speech Processing

Feb 27, 2023
Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du

Via arXiv