"speech": models, code, and papers

Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis

Sep 17, 2020
Yukiya Hono, Kazuna Tsuboi, Kei Sawada, Kei Hashimoto, Keiichiro Oura, Yoshihiko Nankaku, Keiichi Tokuda

Measuring Forgetting of Memorized Training Examples

Jun 30, 2022
Matthew Jagielski, Om Thakkar, Florian Tramèr, Daphne Ippolito, Katherine Lee, Nicholas Carlini, Eric Wallace, Shuang Song, Abhradeep Thakurta, Nicolas Papernot, Chiyuan Zhang

Transfer learning from High-Resource to Low-Resource Language Improves Speech Affect Recognition Classification Accuracy

Mar 04, 2021
Sara Durrani, Umair Arshad

R-MelNet: Reduced Mel-Spectral Modeling for Neural TTS

Jun 30, 2022
Kyle Kastner, Aaron Courville

Self-Supervised Representations Improve End-to-End Speech Translation

Jun 22, 2020
Anne Wu, Changhan Wang, Juan Pino, Jiatao Gu

dictNN: A Dictionary-Enhanced CNN Approach for Classifying Hate Speech on Twitter

Mar 16, 2021
Maximilian Kupi, Michael Bodnar, Nikolas Schmidt, Carlos Eduardo Posada

ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

May 17, 2020
Jung-Woo Ha, Kihyun Nam, Jingu Kang, Sang-Woo Lee, Sohee Yang, Hyunhoon Jung, Eunmi Kim, Hyeji Kim, Soojin Kim, Hyun Ah Kim, Kyoungtae Doh, Chan Kyu Lee, Nako Sung, Sunghun Kim

ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation

Dec 12, 2021
Holy Lovenia, Samuel Cahyawijaya, Genta Indra Winata, Peng Xu, Xu Yan, Zihan Liu, Rita Frieske, Tiezheng Yu, Wenliang Dai, Elham J. Barezi, Pascale Fung

ICASSP 2022 Acoustic Echo Cancellation Challenge

Feb 27, 2022
Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Hannes Gamper, Sebastian Braun, Karsten Sørensen, Robert Aichner

Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information

Apr 13, 2022
Alexander Blatt, Martin Kocour, Karel Veselý, Igor Szöke, Dietrich Klakow
