"speech": models, code, and papers

Microphone Subset Selection for the Weighted Prediction Error Algorithm using a Group Sparsity Penalty

Jan 16, 2024
Anselm Lohmann, Toon van Waterschoot, Joerg Bitzer, Simon Doclo

Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis

Nov 20, 2023
Jungil Kong, Junmo Lee, Jeongmin Kim, Beomjeong Kim, Jihoon Park, Dohee Kong, Changheon Lee, Sangjin Kim

Multimodal Speech Emotion Recognition Using Modality-specific Self-Supervised Frameworks

Dec 04, 2023
Rutherford Agbeshi Patamia, Paulo E. Santos, Kingsley Nketia Acheampong, Favour Ekong, Kwabena Sarpong, She Kun

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Nov 30, 2023
Karren D. Yang, Anurag Ranjan, Jen-Hao Rick Chang, Raviteja Vemulapalli, Oncel Tuzel

Self-Attention and Hybrid Features for Replay and Deep-Fake Audio Detection

Jan 11, 2024
Lian Huang, Chi-Man Pun

E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models

Dec 31, 2023
Hongfei Xue, Yuhao Liang, Bingshen Mu, Shiliang Zhang, Qian Chen, Lei Xie

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion with Parallel Generation

Jan 16, 2024
Hyoung-Seok Oh, Sang-Hoon Lee, Deok-Hyun Cho, Seong-Whan Lee

DiarizationLM: Speaker Diarization Post-Processing with Large Language Models

Jan 16, 2024
Quan Wang, Yiling Huang, Guanlong Zhao, Evan Clark, Wei Xia, Hank Liao

Integrating Plug-and-Play Data Priors with Weighted Prediction Error for Speech Dereverberation

Dec 05, 2023
Ziye Yang, Wenxing Yang, Kai Xie, Jie Chen

GWPT: A Green Word-Embedding-based POS Tagger

Jan 15, 2024
Chengwei Wei, Runqi Pang, C.-C. Jay Kuo
