"speech recognition": models, code, and papers

Applying wav2vec2 for Speech Recognition on Bengali Common Voices Dataset

Sep 11, 2022
H. A. Z. Sameen Shahgir, Khondker Salman Sayeed, Tanjeem Azwad Zaman

Via arXiv

Investigating the Impact of Cross-lingual Acoustic-Phonetic Similarities on Multilingual Speech Recognition

Jul 07, 2022
Muhammad Umar Farooq, Thomas Hain

Via arXiv

Robustness of Multi-Source MT to Transcription Errors

May 26, 2023
Dominik Macháček, Peter Polák, Ondřej Bojar, Raj Dabre

Via arXiv

FusionFormer: Fusing Operations in Transformer for Efficient Streaming Speech Recognition

Oct 31, 2022
Xingchen Song, Di Wu, Binbin Zhang, Zhiyong Wu, Wenpeng Li, Dongfang Li, Pengshen Zhang, Zhendong Peng, Fuping Pan, Changbao Zhu, Zhongqin Wu

Via arXiv

Learning to Jointly Transcribe and Subtitle for End-to-End Spontaneous Speech Recognition

Oct 14, 2022
Jakob Poncelet, Hugo Van hamme

Via arXiv

FonMTL: Towards Multitask Learning for the Fon Language

Aug 28, 2023
Bonaventure F. P. Dossou, Iffanice Houndayi, Pamely Zantou, Gilles Hacheme

Via arXiv

Approximate Nearest Neighbour Phrase Mining for Contextual Speech Recognition

Apr 18, 2023
Maurits Bleeker, Pawel Swietojanski, Stefan Braun, Xiaodan Zhuang

Via arXiv

Chain-based Discriminative Autoencoders for Speech Recognition

Mar 28, 2022
Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang

Via arXiv

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami

Via arXiv

Svarah: Evaluating English ASR Systems on Indian Accents

May 25, 2023
Tahir Javed, Sakshi Joshi, Vignesh Nagarajan, Sai Sundaresan, Janki Nawale, Abhigyan Raman, Kaushal Bhogale, Pratyush Kumar, Mitesh M. Khapra

Via arXiv