"speech": models, code, and papers

Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models

Jan 23, 2024
Chenyang Gao, Brecht Desplanques, Chelsea J.-T. Ju, Aman Chadha, Andreas Stolcke

Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis

Dec 17, 2023
Vikramjit Mitra, Jingping Nie, Erdrin Azemi

Enhancement of a Text-Independent Speaker Verification System by using Feature Combination and Parallel-Structure Classifiers

Jan 26, 2024
Kerlos Atia Abdalmalak, Ascensión Gallardo-Antolín

Distributed Speech Dereverberation Using Weighted Prediction Error

Dec 05, 2023
Ziye Yang, Mengfei Zhang, Jie Chen

Ultra-lightweight Neural Differential DSP Vocoder For High Quality Speech Synthesis

Jan 19, 2024
Prabhav Agrawal, Thilo Koehler, Zhiping Xiu, Prashant Serai, Qing He

Deep Photonic Reservoir Computer for Speech Recognition

Dec 11, 2023
Enrico Picco, Alessandro Lupo, Serge Massar

Amphion: An Open-Source Audio, Music and Speech Generation Toolkit

Dec 15, 2023
Xueyao Zhang, Liumeng Xue, Yuancheng Wang, Yicheng Gu, Xi Chen, Zihao Fang, Haopeng Chen, Lexiao Zou, Chaoren Wang, Jun Han, Kai Chen, Haizhou Li, Zhizheng Wu

StyleCap: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-supervised Learning Models

Nov 28, 2023
Kazuki Yamauchi, Yusuke Ijima, Yuki Saito

E-chat: Emotion-sensitive Spoken Dialogue System with Large Language Models

Jan 06, 2024
Hongfei Xue, Yuhao Liang, Bingshen Mu, Shiliang Zhang, Mengzhe Chen, Qian Chen, Lei Xie

Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection

Dec 20, 2023
Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli
