"speech": models, code, and papers

LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech

Oct 18, 2021
Wen-Chin Huang, Erica Cooper, Junichi Yamagishi, Tomoki Toda

Via arXiv

Prediction of Listener Perception of Argumentative Speech in a Crowdsourced Data Using (Psycho-)Linguistic and Fluency Features

Nov 13, 2021
Yu Qiao, Sourabh Zanwar, Rishab Bhattacharyya, Daniel Wiechmann, Wei Zhou, Elma Kerz, Ralf Schlüter

Via arXiv

An Attribute-Aligned Strategy for Learning Speech Representation

Jun 05, 2021
Yu-Lin Huang, Bo-Hao Su, Y.-W. Peter Hong, Chi-Chun Lee

Via arXiv

Prosody-TTS: An end-to-end speech synthesis system with prosody control

Oct 06, 2021
Giridhar Pamisetty, K. Sri Rama Murty

Via arXiv

ParaTTS: Learning Linguistic and Prosodic Cross-sentence Information in Paragraph-based TTS

Sep 14, 2022
Liumeng Xue, Frank K. Soong, Shaofei Zhang, Lei Xie

Via arXiv

Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR

Nov 22, 2021
Ondrej Klejch, Electra Wallington, Peter Bell

Via arXiv

VSEGAN: Visual Speech Enhancement Generative Adversarial Network

Feb 04, 2021
Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Binbin Chen

Via arXiv

Time-Domain Mapping Based Single-Channel Speech Separation With Hierarchical Constraint Training

Oct 20, 2021
Chenyang Gao, Yue Gu, Ivan Marsic

Via arXiv

Toroidal Probabilistic Spherical Discriminant Analysis

Oct 27, 2022
Anna Silnova, Niko Brümmer, Albert Swart, Lukáš Burget

Via arXiv

A Streamwise GAN Vocoder for Wideband Speech Coding at Very Low Bit Rate

Aug 09, 2021
Ahmed Mustafa, Jan Büthe, Srikanth Korse, Kishan Gupta, Guillaume Fuchs, Nicola Pia

Via arXiv