"speech": models, code, and papers

Cross-Lingual Transfer Learning for Alzheimer's Detection From Spontaneous Speech

Mar 06, 2023
Bastiaan Tamm, Rik Vandenberghe, Hugo Van hamme

GujiBERT and GujiGPT: Construction of Intelligent Information Processing Foundation Language Models for Ancient Texts

Jul 11, 2023
Dongbo Wang, Chang Liu, Zhixiao Zhao, Si Shen, Liu Liu, Bin Li, Haotian Hu, Mengcheng Wu, Litao Lin, Xue Zhao, Xiyu Wang

VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation

May 09, 2023
Yuanda Wang, Hanqing Guo, Guangjing Wang, Bocheng Chen, Qiben Yan

Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition

Feb 22, 2023
Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

VIFS: An End-to-End Variational Inference for Foley Sound Synthesis

Jun 08, 2023
Junhyeok Lee, Hyeonuk Nam, Yong-Hwa Park

MC-SpEx: Towards Effective Speaker Extraction with Multi-Scale Interfusion and Conditional Speaker Modulation

Jun 28, 2023
Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Yukai Ju, Shulin He, Yannan Wang, Zhiyong Wu

Taqyim: Evaluating Arabic NLP Tasks Using ChatGPT Models

Jun 28, 2023
Zaid Alyafeai, Maged S. Alshaibani, Badr AlKhamissi, Hamzah Luqman, Ebrahim Alareqi, Ali Fadel

Evaluating Automatic Speech Recognition in an Incremental Setting

Feb 23, 2023
Ryan Whetten, Mir Tahsin Imtiaz, Casey Kennington

Synthesizing audio from tongue motion during speech using tagged MRI via transformer

Feb 14, 2023
Xiaofeng Liu, Fangxu Xing, Jerry L. Prince, Maureen Stone, Georges El Fakhri, Jonghye Woo

Using Deepfake Technologies for Word Emphasis Detection

May 12, 2023
Eran Kaufman, Lee-Ad Gottlieb
