"speech recognition": models, code, and papers

A Multimodal Dynamical Variational Autoencoder for Audiovisual Speech Representation Learning

May 05, 2023
Samir Sadok, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda, Renaud Séguier

Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognit

Mar 23, 2023
Haoyu Tang, Zhaoyi Liu, Chang Zeng, Xinfeng Li

Knowledge-driven Subword Grammar Modeling for Automatic Speech Recognition in Tamil and Kannada

Jul 27, 2022
Madhavaraj A, Bharathi Pilar, Ramakrishnan A G

UML: A Universal Monolingual Output Layer for Multilingual ASR

Feb 22, 2023
Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-yiin Chang

Multitask Learning for Grapheme-to-Phoneme Conversion of Anglicisms in German Speech Recognition

May 26, 2021
Julia Pritzen, Michael Gref, Christoph Schmidt, Dietlind Zühlke

Efficient Ensemble Architecture for Multimodal Acoustic and Textual Embeddings in Punctuation Restoration using Time-Delay Neural Networks

Feb 26, 2023
Xing Yi Liu, Homayoon Beigi

Streaming end-to-end speech recognition with jointly trained neural feature enhancement

May 04, 2021
Chanwoo Kim, Abhinav Garg, Dhananjaya Gowda, Seongkyu Mun, Changwoo Han

Speech Recognition with Augmented Synthesized Speech

Sep 25, 2019
Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro Moreno, Yonghui Wu, Zelin Wu

The NTNU System for Formosa Speech Recognition Challenge 2020

Apr 20, 2021
Fu-An Chao, Tien-Hong Lo, Shi-Yan Weng, Shih-Hsuan Chiu, Yao-Ting Sung, Berlin Chen

A comparison of streaming models and data augmentation methods for robust speech recognition

Nov 19, 2021
Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim
