"speech": models, code, and papers

Speech frame implementation for speech analysis and recognition

Dec 15, 2021
A. A. Konev, V. S. Khlebnikov, A. Yu. Yakimuk

Revisiting End-to-End Speech-to-Text Translation From Scratch

Jun 09, 2022
Biao Zhang, Barry Haddow, Rico Sennrich

DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech Entity Linking

Sep 29, 2022
Shen Huang, Yuchen Zhai, Xinwei Long, Yong Jiang, Xiaobin Wang, Yin Zhang, Pengjun Xie

SVTS: Scalable Video-to-Speech Synthesis

May 04, 2022
Rodrigo Mira, Alexandros Haliassos, Stavros Petridis, Björn W. Schuller, Maja Pantic

Transfer Learning Framework for Low-Resource Text-to-Speech using a Large-Scale Unlabeled Speech Corpus

Mar 29, 2022
Minchan Kim, Myeonghun Jeong, Byoung Jin Choi, Sunghwan Ahn, Joun Yeop Lee, Nam Soo Kim

E-Branchformer: Branchformer with Enhanced merging for speech recognition

Sep 30, 2022
Kwangyoun Kim, Felix Wu, Yifan Peng, Jing Pan, Prashant Sridhar, Kyu J. Han, Shinji Watanabe

BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus

Jul 07, 2022
Josh Meyer, David Ifeoluwa Adelani, Edresson Casanova, Alp Öktem, Daniel Whitenack, Julian Weber, Salomon Kabongo, Elizabeth Salesky, Iroro Orife, Colin Leong, Perez Ogayo, Chris Emezue, Jonathan Mukiibi, Salomey Osei, Apelete Agbolo, Victor Akinode, Bernard Opoku, Samuel Olanrewaju, Jesujoba Alabi, Shamsuddeen Muhammad

Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners

Apr 08, 2022
Zehai Tu, Ning Ma, Jon Barker

StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis

May 30, 2022
Yinghao Aaron Li, Cong Han, Nima Mesgarani

Streaming, fast and accurate on-device Inverse Text Normalization for Automatic Speech Recognition

Nov 07, 2022
Yashesh Gaur, Nick Kibre, Jian Xue, Kangyuan Shu, Yuhui Wang, Issac Alphanso, Jinyu Li, Yifan Gong
