"speech": models, code, and papers

FaceDiffuser: Speech-Driven 3D Facial Animation Synthesis Using Diffusion

Sep 20, 2023
Stefan Stan, Kazi Injamamul Haque, Zerrin Yumak


Zipformer: A faster and better encoder for automatic speech recognition

Oct 17, 2023
Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey


Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization

Nov 17, 2023
Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He


Generative Adversarial Training for Text-to-Speech Synthesis Based on Raw Phonetic Input and Explicit Prosody Modelling

Oct 14, 2023
Tiberiu Boros, Stefan Daniel Dumitrescu, Ionut Mironica, Radu Chivereanu


Layer-Adapted Implicit Distribution Alignment Networks for Cross-Corpus Speech Emotion Recognition

Oct 06, 2023
Yan Zhao, Yuan Zong, Jincen Wang, Hailun Lian, Cheng Lu, Li Zhao, Wenming Zheng


Multi-teacher Distillation for Multilingual Spelling Correction

Nov 20, 2023
Jingfen Zhang, Xuan Guo, Sravan Bodapati, Christopher Potts


Complexity Scaling for Speech Denoising

Sep 14, 2023
Hangting Chen, Jianwei Yu, Chao Weng


Soft Random Sampling: A Theoretical and Empirical Analysis

Nov 21, 2023
Xiaodong Cui, Ashish Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury


RAUCG: Retrieval-Augmented Unsupervised Counter Narrative Generation for Hate Speech

Oct 09, 2023
Shuyu Jiang, Wenyi Tang, Xingshu Chen, Rui Tang, Haizhou Wang, Wenxian Wang


VoxArabica: A Robust Dialect-Aware Arabic Speech Recognition System

Oct 17, 2023
Abdul Waheed, Bashar Talafha, Peter Sullivan, AbdelRahim Elmadany, Muhammad Abdul-Mageed
