"speech": models, code, and papers

Scribosermo: Fast Speech-to-Text models for German and other Languages

Oct 15, 2021
Daniel Bermuth, Alexander Poeppel, Wolfgang Reif

Efficient Transformer for Direct Speech Translation

Jul 07, 2021
Belen Alastruey, Gerard I. Gállego, Marta R. Costa-jussà

Privacy attacks for automatic speech recognition acoustic models in a federated learning framework

Nov 06, 2021
Natalia Tomashenko, Salima Mdhaffar, Marc Tommasi, Yannick Estève, Jean-François Bonastre

Improving Channel Decorrelation for Multi-Channel Target Speech Extraction

Jun 06, 2021
Jiangyu Han, Wei Rao, Yannan Wang, Yanhua Long

The NTNU System for Formosa Speech Recognition Challenge 2020

Apr 14, 2021
Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu, Yao-Ting Sung, Berlin Chen

UX-NET: Filter-and-Process-based Improved U-Net for Real-time Time-domain Audio Separation

Oct 28, 2022
Kashyap Patel, Anton Kovalyov, Issa Panahi

InterMPL: Momentum Pseudo-Labeling with Intermediate CTC Loss

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

BECTRA: Transducer-based End-to-End ASR with BERT-Enhanced Encoder

Nov 02, 2022
Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe

Investigating data partitioning strategies for crosslinguistic low-resource ASR evaluation

Aug 26, 2022
Zoey Liu, Justin Spence, Emily Prud'hommeaux

Multilingual Speech Translation with Unified Transformer: Huawei Noah's Ark Lab at IWSLT 2021

Jun 22, 2021
Xingshan Zeng, Liangyou Li, Qun Liu
