"speech": models, code, and papers

Towards an AI to Win Ghana's National Science and Maths Quiz

Aug 08, 2023
George Boateng, Jonathan Abrefah Mensah, Kevin Takyi Yeboah, William Edor, Andrew Kojo Mensah-Onumah, Naafi Dasana Ibrahim, Nana Sam Yeboah

Via arXiv

Exploiting Time-Frequency Conformers for Music Audio Enhancement

Aug 24, 2023
Yunkee Chae, Junghyun Koo, Sungho Lee, Kyogu Lee

Via arXiv

Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis

Apr 13, 2023
Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng

Via arXiv

A Deep Dive into the Disparity of Word Error Rates Across Thousands of NPTEL MOOC Videos

Jul 20, 2023
Anand Kumar Rai, Siddharth D Jaiswal, Animesh Mukherjee

Via arXiv

Parts of Speech-Grounded Subspaces in Vision-Language Models

May 23, 2023
James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras

Via arXiv

Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak

Jun 07, 2023
Jan Lehečka, Josef V. Psutka, Josef Psutka

Via arXiv

On the Robustness of Arabic Speech Dialect Identification

Jun 01, 2023
Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed

Via arXiv

Predicting EEG Responses to Attended Speech via Deep Neural Networks for Speech

Feb 27, 2023
Emina Alickovic, Tobias Dorszewski, Thomas U. Christiansen, Kasper Eskelund, Leonardo Gizzi, Martin A. Skoglund, Dorothea Wendt

Via arXiv

Textless Speech-to-Music Retrieval Using Emotion Similarity

Mar 19, 2023
SeungHeon Doh, Minz Won, Keunwoo Choi, Juhan Nam

Via arXiv

ICASSP 2023 Speech Signal Improvement Challenge

Mar 12, 2023
Ross Cutler, Ando Saabas, Babak Naderi, Nicolae-Cătălin Ristea, Sebastian Braun, Solomiya Branets

Via arXiv