"speech": models, code, and papers

End-to-End Speech Recognition with High-Frame-Rate Features Extraction

Jul 12, 2019
Cong-Thanh Do

Accessibility and Trajectory-Based Text Characterization

Jan 17, 2022
Bárbara C. e Souza, Filipi N. Silva, Henrique F. de Arruda, Luciano da F. Costa, Diego R. Amancio

Generative adversarial network-based glottal waveform model for statistical parametric speech synthesis

Mar 14, 2019
Bajibabu Bollepalli, Lauri Juvela, Paavo Alku

DeepStroke: An Efficient Stroke Screening Framework for Emergency Rooms with Multimodal Adversarial Deep Learning

Sep 24, 2021
Tongan Cai, Haomiao Ni, Mingli Yu, Xiaolei Huang, Kelvin Wong, John Volpi, James Z. Wang, Stephen T. C. Wong

Real to H-space Encoder for Speech Recognition

Jun 17, 2019
Titouan Parcollet, Mohamed Morchid, Georges Linarès, Renato De Mori

Multimodal Speech Emotion Recognition and Ambiguity Resolution

Apr 12, 2019
Gaurav Sahu

Triplet loss based embeddings for forensic speaker identification in Spanish

Feb 24, 2021
Emmanuel Maqueda, Javier Alvarez-Jimenez, Carlos Mena, Ivan Meza

Exploiting semi-supervised training through a dropout regularization in end-to-end speech recognition

Aug 08, 2019
Subhadeep Dey, Petr Motlicek, Trung Bui, Franck Dernoncourt

Information Retrieval for ZeroSpeech 2021: The Submission by University of Wroclaw

Jun 22, 2021
Jan Chorowski, Grzegorz Ciesielski, Jarosław Dzikowski, Adrian Łańcucki, Ricard Marxer, Mateusz Opala, Piotr Pusz, Paweł Rychlikowski, Michał Stypułkowski

Domain Prompts: Towards memory and compute efficient domain adaptation of ASR systems

Dec 16, 2021
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff
