"speech": models, code, and papers

Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model

Apr 07, 2022
Nick J. C. Wang, Lu Wang, Yandan Sun, Haimei Kang, Dejun Zhang

Arabic Code-Switching Speech Recognition using Monolingual Data

Jul 04, 2021
Ahmed Ali, Shammur Chowdhury, Amir Hussein, Yasser Hifny

Unsupervised Word Segmentation using K Nearest Neighbors

Apr 27, 2022
Tzeviya Sylvia Fuchs, Yedid Hoshen, Joseph Keshet

Gender Representation in Open Source Speech Resources

Mar 18, 2020
Mahault Garnerin, Solange Rossato, Laurent Besacier

Time-Frequency Localization Using Deep Convolutional Maxout Neural Network in Persian Speech Recognition

Sep 06, 2021
Arash Dehghani, Seyyed Ali Seyyedsalehi

WAVPROMPT: Towards Few-Shot Spoken Language Understanding with Frozen Language Models

Mar 29, 2022
Heting Gao, Junrui Ni, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson

Controllable neural text-to-speech synthesis using intuitive prosodic features

Sep 14, 2020
Tuomo Raitio, Ramya Rasipuram, Dan Castellani

StableFace: Analyzing and Improving Motion Stability for Talking Face Generation

Aug 29, 2022
Jun Ling, Xu Tan, Liyang Chen, Runnan Li, Yuchao Zhang, Sheng Zhao, Li Song

VoiceMe: Personalized voice generation in TTS

Mar 29, 2022
Pol van Rijn, Silvan Mertes, Dominik Schiller, Piotr Dura, Hubert Siuzdak, Peter M. C. Harrison, Elisabeth André, Nori Jacoby

Multi-Task Learning with Sentiment, Emotion, and Target Detection to Recognize Hate Speech and Offensive Language

Oct 13, 2021
Flor Miriam Plaza-del-Arco, Sercan Halat, Sebastian Padó, Roman Klinger
