"speech": models, code, and papers

Unsupervised Word Segmentation from Discrete Speech Units in Low-Resource Settings

Jun 08, 2021
Marcely Zanon Boito, Bolaji Yusuf, Lucas Ondel, Aline Villavicencio, Laurent Besacier

SERAB: A multi-lingual benchmark for speech emotion recognition

Oct 07, 2021
Neil Scheidwasser-Clow, Mikolaj Kegler, Pierre Beckmann, Milos Cernak

Scalable Data Annotation Pipeline for High-Quality Large Speech Datasets Development

Sep 01, 2021
Mingkuan Liu, Chi Zhang, Hua Xing, Chao Feng, Monchu Chen, Judith Bishop, Grace Ngapo

Deep Learning Based Assessment of Synthetic Speech Naturalness

Apr 23, 2021
Gabriel Mittag, Sebastian Möller

TPARN: Triple-path Attentive Recurrent Network for Time-domain Multichannel Speech Enhancement

Oct 20, 2021
Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang

A Study on Speech Enhancement Based on Diffusion Probabilistic Model

Jul 25, 2021
Yen-Ju Lu, Yu Tsao, Shinji Watanabe

SEMOUR: A Scripted Emotional Speech Repository for Urdu

May 19, 2021
Nimra Zaheer, Obaid Ullah Ahmad, Ammar Ahmed, Muhammad Shehryar Khan, Mudassir Shabbir

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

Mar 26, 2022
Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin

LAE: Language-Aware Encoder for Monolingual and Multilingual ASR

Jun 05, 2022
Jinchuan Tian, Jianwei Yu, Chunlei Zhang, Chao Weng, Yuexian Zou, Dong Yu

Predicting speech intelligibility from EEG using a dilated convolutional network

May 19, 2021
Bernd Accou, Mohammad Jalilpour Monesi, Hugo Van hamme, Tom Francart