"speech": models, code, and papers

A Novel Interpretable and Generalizable Re-synchronization Model for Cued Speech based on a Multi-Cuer Corpus

Jun 05, 2023
Lufei Gao, Shan Huang, Li Liu

Flowchase: a Mobile Application for Pronunciation Training

Jul 05, 2023
Noé Tits, Zoé Broisson

Multi-perspective Information Fusion Res2Net with RandomSpecmix for Fake Speech Detection

Jun 27, 2023
Shunbo Dong, Jun Xue, Cunhang Fan, Kang Zhu, Yujie Chen, Zhao Lv

A Survey on Deep Multi-modal Learning for Body Language Recognition and Generation

Aug 17, 2023
Li Liu, Lufei Gao, Wentao Lei, Fengji Ma, Xiaotian Lin, Jinting Wang

Federated learning for secure development of AI models for Parkinson's disease detection using speech from different languages

May 18, 2023
Soroosh Tayebi Arasteh, Cristian David Rios-Urrego, Elmar Noeth, Andreas Maier, Seung Hee Yang, Jan Rusz, Juan Rafael Orozco-Arroyave

ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings

May 23, 2023
Yuki Saito, Shinnosuke Takamichi, Eiji Iimori, Kentaro Tachibana, Hiroshi Saruwatari

Context-aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis

Apr 13, 2023
Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng

Toward Generalizable Machine Learning Models in Speech, Language, and Hearing Sciences: Sample Size Estimation and Reducing Overfitting

Aug 30, 2023
Hamzeh Ghasemzadeh, Robert E. Hillman, Daryush D. Mehta

Parts of Speech-Grounded Subspaces in Vision-Language Models

May 23, 2023
James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A. Nicolaou, Ioannis Patras

Predicting EEG Responses to Attended Speech via Deep Neural Networks for Speech

Feb 27, 2023
Emina Alickovic, Tobias Dorszewski, Thomas U. Christiansen, Kasper Eskelund, Leonardo Gizzi, Martin A. Skoglund, Dorothea Wendt
