"speech": models, code, and papers

Investigating the Utility of Surprisal from Large Language Models for Speech Synthesis Prosody

Jun 16, 2023
Sofoklis Kakouros, Juraj Šimko, Martti Vainio, Antti Suni

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

Jun 01, 2023
Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

RedPenNet for Grammatical Error Correction: Outputs to Tokens, Attentions to Spans

Sep 19, 2023
Bohdan Didenko, Andrii Sameliuk

Fast Conformer with Linearly Scalable Attention for Efficient Speech Recognition

May 19, 2023
Dima Rekesh, Samuel Kriman, Somshubra Majumdar, Vahid Noroozi, He Huang, Oleksii Hrinchuk, Ankur Kumar, Boris Ginsburg

Recovering implicit pitch contours from formants in whispered speech

Jul 06, 2023
Pablo Pérez Zarazaga, Zofia Malisz

Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer

Sep 14, 2023
Peng Wang, Yifan Yang, Zheng Liang, Tian Tan, Shiliang Zhang, Xie Chen

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

Jun 10, 2023
Xize Cheng, Tao Jin, Linjun Li, Wang Lin, Xinyu Duan, Zhou Zhao

Better speech synthesis through scaling

May 12, 2023
James Betker

Improving Robustness of Neural Inverse Text Normalization via Data-Augmentation, Semi-Supervised Learning, and Post-Aligning Method

Sep 12, 2023
Juntae Kim, Minkyu Lim, Seokjin Hong

Considerations for Ethical Speech Recognition Datasets

May 03, 2023
Orestis Papakyriakopoulos, Alice Xiang
