"speech": models, code, and papers

Large Raw Emotional Dataset with Aggregation Mechanism

Dec 23, 2022
Vladimir Kondratenko, Artem Sokolov, Nikolay Karpov, Oleg Kutuzov, Nikita Savushkin, Fyodor Minkin


Vocal effort modeling in neural TTS for improving the intelligibility of synthetic speech in noise

Mar 29, 2022
Tuomo Raitio, Petko Petkov, Jiangchuan Li, Muhammed Shifas, Andrea Davis, Yannis Stylianou


Multi-Channel Speech Denoising for Machine Ears

Feb 17, 2022
Cong Han, E. Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile


Analyzing Hate Speech Data along Racial, Gender and Intersectional Axes

May 18, 2022
Antonis Maronikolakis, Philip Baader, Hinrich Schütze


Can Self-Supervised Learning solve the problem of child speech recognition?

Apr 06, 2022
Rishabh Jain, Mariam Yiwere, Dan Bigioi, Peter Corcoran


Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments

Mar 22, 2022
Antonis Maronikolakis, Axel Wisiorek, Leah Nann, Haris Jabbar, Sahana Udupa, Hinrich Schuetze


Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement

Nov 12, 2022
Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Tiago H. Falk


Distinguishable Speaker Anonymization based on Formant and Fundamental Frequency Scaling

Nov 06, 2022
Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu

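The entry above names fundamental-frequency (F0) scaling as an anonymization ingredient. As a minimal sketch of that idea only — not the paper's actual method, and with hypothetical contour values — an F0 contour can be shifted away from the original speaker's range by a multiplicative factor, leaving unvoiced frames (marked by zeros) untouched:

```python
import numpy as np

def scale_f0(f0_contour, alpha):
    """Scale a fundamental-frequency contour (Hz) by factor alpha.

    Toy illustration of F0 scaling for speaker anonymization;
    frames with f0 == 0 are treated as unvoiced and left at zero.
    """
    f0 = np.asarray(f0_contour, dtype=float)
    return np.where(f0 > 0, f0 * alpha, 0.0)

# Hypothetical contour in Hz; 0.0 marks an unvoiced frame.
contour = [120.0, 0.0, 125.0, 130.0]
# Raise the pitch range by 20% to move it away from the source speaker.
print(scale_f0(contour, 1.2))
```

A real system would scale formants as well and then resynthesize the waveform; this snippet only shows the contour transformation itself.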

Guided-TTS 2: A Diffusion Model for High-quality Adaptive Text-to-Speech with Untranscribed Data

May 30, 2022
Sungwon Kim, Heeseung Kim, Sungroh Yoon


STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent

Mar 28, 2022
Yuki Saito, Yuto Nishimura, Shinnosuke Takamichi, Kentaro Tachibana, Hiroshi Saruwatari
