"speech": models, code, and papers

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

Jun 08, 2021
Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio, Laurent Besacier

Via arXiv

Deep Learning Based Assessment of Synthetic Speech Naturalness

Apr 23, 2021
Gabriel Mittag, Sebastian Möller

Via arXiv

GANSpeech: Adversarial Training for High-Fidelity Multi-Speaker Speech Synthesis

Jun 29, 2021
Jinhyeok Yang, Jae-Sung Bae, Taejun Bak, Youngik Kim, Hoon-Young Cho

Via arXiv

Egocentric Audio-Visual Noise Suppression

Nov 07, 2022
Roshan Sharma, Weipeng He, Ju Lin, Egor Lakomkin, Yang Liu, Kaustubh Kalgaonkar

Via arXiv

Generalized Product-of-Experts for Learning Multimodal Representations in Noisy Environments

Nov 07, 2022
Abhinav Joshi, Naman Gupta, Jinang Shah, Binod Bhattarai, Ashutosh Modi, Danail Stoyanov

Via arXiv

Speech Resynthesis from Discrete Disentangled Self-Supervised Representations

Apr 02, 2021
Adam Polyak, Yossi Adi, Jade Copet, Eugene Kharitonov, Kushal Lakhotia, Wei-Ning Hsu, Abdelrahman Mohamed, Emmanuel Dupoux

Via arXiv

SEMOUR: A Scripted Emotional Speech Repository for Urdu

May 19, 2021
Nimra Zaheer, Obaid Ullah Ahmad, Ammar Ahmed, Muhammad Shehryar Khan, Mudassir Shabbir

Via arXiv

TransPOS: Transformers for Consolidating Different POS Tagset Datasets

Sep 24, 2022
Alex Li, Ilyas Bankole-Hameed, Ranadeep Singh, Gabriel Shen Han Ng, Akshat Gupta

Via arXiv

A Review on Part-of-Speech Technologies

Oct 11, 2021
Onyenwe Ikechukwu, Onyedikachukwu Ikechukwu-Onyenwe, Onyedinma Ebele

Via arXiv

Predicting speech intelligibility from EEG using a dilated convolutional network

May 19, 2021
Bernd Accou, Mohammad Jalilpour Monesi, Hugo Van hamme, Tom Francart

Via arXiv