"speech": models, code, and papers

Comparison of Speech Representations for Automatic Quality Estimation in Multi-Speaker Text-to-Speech Synthesis

Feb 28, 2020
Jennifer Williams, Joanna Rownicka, Pilar Oplustil, Simon King

FastSpeech 2: Fast and High-Quality End-to-End Text-to-Speech

Jun 08, 2020
Yi Ren, Chenxu Hu, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Latency-Controlled Neural Architecture Search for Streaming Speech Recognition

May 08, 2021
Liqiang He, Shulin Feng, Dan Su, Dong Yu

Real time spectrogram inversion on mobile phone

Mar 10, 2022
Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy

DNSMOS: A Non-Intrusive Perceptual Objective Speech Quality metric to evaluate Noise Suppressors

Oct 28, 2020
Chandan K A Reddy, Vishak Gopal, Ross Cutler

Cross Lingual Cross Corpus Speech Emotion Recognition

Mar 18, 2020
Shivali Goel, Homayoon Beigi

Hearing Lips: Improving Lip Reading by Distilling Speech Recognizers

Nov 26, 2019
Ya Zhao, Rui Xu, Xinchao Wang, Peng Hou, Haihong Tang, Mingli Song

Semantic Characteristics of Schizophrenic Speech

Apr 16, 2019
Kfir Bar, Vered Zilberstein, Ido Ziv, Heli Baram, Nachum Dershowitz, Samuel Itzikowitz, Eiran Vadim Harel

Reduction of Subjective Listening Effort for TV Broadcast Signals with Recurrent Neural Networks

Nov 02, 2021
Nils L. Westhausen, Rainer Huber, Hannah Baumgartner, Ragini Sinha, Jan Rennies, Bernd T. Meyer

Adapt-and-Adjust: Overcoming the Long-Tail Problem of Multilingual Speech Recognition

Dec 03, 2020
Genta Indra Winata, Guangsen Wang, Caiming Xiong, Steven Hoi
