"speech": models, code, and papers

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks

Oct 25, 2019
Alexandros Kastanos, Anton Ragni, Mark Gales

Evaluating the COVID-19 Identification ResNet (CIdeR) on the INTERSPEECH COVID-19 from Audio Challenges

Jul 30, 2021
Alican Akman, Harry Coppock, Alexander Gaskell, Panagiotis Tzirakis, Lyn Jones, Björn W. Schuller

What BERT Based Language Models Learn in Spoken Transcripts: An Empirical Study

Sep 21, 2021
Ayush Kumar, Mukuntha Narayanan Sundararaman, Jithendra Vepa

An Investigation Into On-device Personalization of End-to-end Automatic Speech Recognition Models

Sep 14, 2019
Khe Chai Sim, Petr Zadrazil, Françoise Beaufays

Two-Pass End-to-End Speech Recognition

Aug 29, 2019
Tara N. Sainath, Ruoming Pang, David Rybach, Yanzhang He, Rohit Prabhavalkar, Wei Li, Mirkó Visontai, Qiao Liang, Trevor Strohman, Yonghui Wu, Ian McGraw, Chung-Cheng Chiu

iRNN: Integer-only Recurrent Neural Network

Sep 20, 2021
Eyyüb Sari, Vanessa Courville, Vahid Partovi Nia

Hierarchical Generative Modeling for Controllable Speech Synthesis

Oct 16, 2018
Wei-Ning Hsu, Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Yuxuan Wang, Yuan Cao, Ye Jia, Zhifeng Chen, Jonathan Shen, Patrick Nguyen, Ruoming Pang

Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs

Apr 07, 2021
Sujeong Cha, Wangrui Hou, Hyun Jung, My Phung, Michael Picheny, Hong-Kwang Kuo, Samuel Thomas, Edmilson Morais

Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks

Apr 16, 2019
Ryan Eloff, André Nortje, Benjamin van Niekerk, Avashna Govender, Leanne Nortje, Arnu Pretorius, Elan van Biljon, Ewald van der Westhuizen, Lisa van Staden, Herman Kamper

HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection

Feb 02, 2022
Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov
