"speech": models, code, and papers

Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks

Apr 16, 2019
Ryan Eloff, André Nortje, Benjamin van Niekerk, Avashna Govender, Leanne Nortje, Arnu Pretorius, Elan van Biljon, Ewald van der Westhuizen, Lisa van Staden, Herman Kamper

The NTNU System at the Interspeech 2020 Non-Native Children's Speech ASR Challenge

May 18, 2020
Tien-Hong Lo, Fu-An Chao, Shi-Yan Weng, Berlin Chen

A Fast Network Exploration Strategy to Profile Low Energy Consumption for Keyword Spotting

Feb 04, 2022
Arnab Neelim Mazumder, Tinoosh Mohsenin

A Lip Sync Expert Is All You Need for Speech to Lip Generation In The Wild

Aug 23, 2020
K R Prajwal, Rudrabha Mukhopadhyay, Vinay Namboodiri, C V Jawahar

Emotion Recognition in Speech using Cross-Modal Transfer in the Wild

Aug 16, 2018
Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman

DNN-Based Semantic Model for Rescoring N-best Speech Recognition List

Nov 02, 2020
Dominique Fohr, Irina Illina

Streaming End-to-End ASR based on Blockwise Non-Autoregressive Models

Jul 20, 2021
Tianzi Wang, Yuya Fujita, Xuankai Chang, Shinji Watanabe

Bunched LPCNet: Vocoder for Low-cost Neural Text-To-Speech Systems

Aug 11, 2020
Ravichander Vipperla, Sangjun Park, Kihyun Choo, Samin Ishtiaq, Kyoungbo Min, Sourav Bhattacharya, Abhinav Mehrotra, Alberto Gil C. P. Ramos, Nicholas D. Lane

The Mirrornet: Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction

Oct 12, 2021
Yashish M. Siriwardena, Guilhem Marion, Shihab Shamma

Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging

Oct 04, 2020
Ehsan Doostmohammadi, Minoo Nassajian, Adel Rahimi
