"speech": models, code, and papers

Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data

Mar 31, 2022
Junyi Ao, Ziqiang Zhang, Long Zhou, Shujie Liu, Haizhou Li, Tom Ko, Lirong Dai, Jinyu Li, Yao Qian, Furu Wei

Whose Emotion Matters? Speaker Detection without Prior Knowledge

Dec 08, 2022
Hugo Carneiro, Cornelius Weber, Stefan Wermter

DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

Jan 28, 2022
Songxiang Liu, Dan Su, Dong Yu

Speech Resources in the Tamasheq Language

Jan 14, 2022
Marcely Zanon Boito, Fethi Bougares, Florentin Barbier, Souhir Gahbiche, Loïc Barrault, Mickael Rouvier, Yannick Estève

Automated speech tools for helping communities process restricted-access corpora for language revival efforts

Apr 15, 2022
Nay San, Martijn Bartelds, Tolúlọpẹ́ Ògúnrẹ̀mí, Alison Mount, Ruben Thompson, Michael Higgins, Roy Barker, Jane Simpson, Dan Jurafsky

MixupE: Understanding and Improving Mixup from Directional Derivative Perspective

Dec 29, 2022
Vikas Verma, Sarthak Mittal, Wai Hoh Tang, Hieu Pham, Juho Kannala, Yoshua Bengio, Arno Solin, Kenji Kawaguchi

In search of strong embedding extractors for speaker diarisation

Oct 26, 2022
Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung

Modelling low-resource accents without accent-specific TTS frontend

Jan 11, 2023
Georgi Tinchev, Marta Czarnowska, Kamil Deja, Kayoko Yanagisawa, Marius Cotescu

Norm of word embedding encodes information gain

Dec 19, 2022
Momose Oyama, Sho Yokoi, Hidetoshi Shimodaira

On the Utility of Self-supervised Models for Prosody-related Tasks

Oct 13, 2022
Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Nigel G. Ward
