"speech": models, code, and papers

Performance of data-driven inner speech decoding with same-task EEG-fMRI data fusion and bimodal models

Jun 19, 2023
Holly Wilson, Scott Wellington, Foteini Simistira Liwicki, Vibha Gupta, Rajkumar Saini, Kanjar De, Nosheen Abid, Sumit Rakesh, Johan Eriksson, Oliver Watts, Xi Chen, Mohammad Golbabaee, Michael J. Proulx, Marcus Liwicki, Eamonn O'Neill, Benjamin Metcalfe

A Study on the Reliability of Automatic Dysarthric Speech Assessments

Jun 07, 2023
Xavier F. Cadet, Ranya Aloufi, Sara Ahmadi-Abhari, Hamed Haddadi

In-the-wild Speech Emotion Conversion Using Disentangled Self-Supervised Representations and Neural Vocoder-based Resynthesis

Jun 02, 2023
Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann

Tagged End-to-End Simultaneous Speech Translation Training using Simultaneous Interpretation Data

Jun 14, 2023
Yuka Ko, Ryo Fukuda, Yuta Nishikawa, Yasumasa Kano, Katsuhito Sudoh, Satoshi Nakamura

QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

May 18, 2023
Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang

Developing Social Robots with Empathetic Non-Verbal Cues Using Large Language Models

Aug 31, 2023
Yoon Kyung Lee, Yoonwon Jung, Gyuyi Kang, Sowon Hahn

iSTFTNet2: Faster and More Lightweight iSTFT-Based Neural Vocoder Using 1D-2D CNN

Aug 14, 2023
Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki

Spread Control Method on Unknown Networks Based on Hierarchical Reinforcement Learning

Aug 28, 2023
Wenxiang Dong, H. Vicky Zhao

The ART of Conversation: Measuring Phonetic Convergence and Deliberate Imitation in L2-Speech with a Siamese RNN

Jun 08, 2023
Zheng Yuan, Aldo Pastore, Dorina de Jong, Hao Xu, Luciano Fadiga, Alessandro D'Ausilio

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data

May 25, 2023
Takafumi Moriya, Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Takanori Ashihara, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura, Atsunori Ogawa, Taichi Asami
