"speech": models, code, and papers

Reinforcement Learning and Bandits for Speech and Language Processing: Tutorial, Review and Outlook

Oct 24, 2022
Baihan Lin

Via arXiv

Adaptive Speech Quality Aware Complex Neural Network for Acoustic Echo Cancellation with Supervised Contrastive Learning

Nov 01, 2022
Bozhong Liu, Xiaoxi Yu, Hantao Huang

Via arXiv

Hey ASR System! Why Aren't You More Inclusive? Automatic Speech Recognition Systems' Bias and Proposed Bias Mitigation Techniques. A Literature Review

Nov 17, 2022
Mikel K. Ngueajio, Gloria Washington

Via arXiv

Phonemic Representation and Transcription for Speech to Text Applications for Under-resourced Indigenous African Languages: The Case of Kiswahili

Oct 29, 2022
Ebbie Awino, Lilian Wanzare, Lawrence Muchemi, Barack Wanjawa, Edward Ombui, Florence Indede, Owen McOnyango, Benard Okal

Via arXiv

Multi-Scale Feature Fusion Transformer Network for End-to-End Single Channel Speech Separation

Dec 14, 2022
Yinhao Xu, Jian Zhou, Liang Tao, Hon Keung Kwan

Via arXiv

Rhythmic Gesticulator: Rhythm-Aware Co-Speech Gesture Synthesis with Hierarchical Neural Embeddings

Oct 05, 2022
Tenglong Ao, Qingzhe Gao, Yuke Lou, Baoquan Chen, Libin Liu

Via arXiv

Towards Building Text-To-Speech Systems for the Next Billion Users

Nov 17, 2022
Gokul Karthik Kumar, Praveen S V, Pratyush Kumar, Mitesh M. Khapra, Karthik Nandakumar

Via arXiv

Contextual-Utterance Training for Automatic Speech Recognition

Oct 27, 2022
Alejandro Gomez-Alanis, Lukas Drude, Andreas Schwarz, Rupak Vignesh Swaminathan, Simon Wiesler

Via arXiv

Comparative layer-wise analysis of self-supervised speech models

Nov 08, 2022
Ankita Pasad, Bowen Shi, Karen Livescu

Via arXiv

A unified one-shot prosody and speaker conversion system with self-supervised discrete speech units

Nov 12, 2022
Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky

Via arXiv