"speech": models, code, and papers

Voicebox: Text-Guided Multilingual Universal Speech Generation at Scale

Jun 23, 2023
Matthew Le, Apoorv Vyas, Bowen Shi, Brian Karrer, Leda Sari, Rashel Moritz, Mary Williamson, Vimal Manohar, Yossi Adi, Jay Mahadeokar, Wei-Ning Hsu

Big model only for hard audios: Sample dependent Whisper model selection for efficient inferences

Sep 22, 2023
Hugo Malard, Salah Zaiem, Robin Algayres

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

Jun 13, 2023
Claytone Sikasote, Kalinda Siaminwe, Stanly Mwape, Bangiwe Zulu, Mofya Phiri, Martin Phiri, David Zulu, Mayumbo Nyirenda, Antonios Anastasopoulos

Evaluating Methods for Ground-Truth-Free Foreign Accent Conversion

Sep 05, 2023
Wen-Chin Huang, Tomoki Toda

Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion

Jun 11, 2023
Yung-Lun Chien, Hsin-Hao Chen, Ming-Chi Yen, Shu-Wei Tsai, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi

A small vocabulary database of ultrasound image sequences of vocal tract dynamics

Aug 26, 2023
Margareth Castillo, Felipe Rubio, Dagoberto Porras, Sonia H. Contreras-Ortiz, Alexander Sepúlveda

Don't Stop Self-Supervision: Accent Adaptation of Speech Representations via Residual Adapters

Jul 02, 2023
Anshu Bhatia, Sanchit Sinha, Saket Dingliwal, Karthik Gopalakrishnan, Sravan Bodapati, Katrin Kirchhoff

MFCCGAN: A Novel MFCC-Based Speech Synthesizer Using Adversarial Learning

Jun 22, 2023
Mohammad Reza Hasanabadi, Majid Behdad, Davood Gharavian

SpeechGen: Unlocking the Generative Power of Speech Language Models with Prompts

Jun 19, 2023
Haibin Wu, Kai-Wei Chang, Yuan-Kuei Wu, Hung-yi Lee

Weakly-Supervised Speech Pre-training: A Case Study on Target Speech Recognition

May 25, 2023
Wangyou Zhang, Yanmin Qian
