"speech": models, code, and papers

MSM-VC: High-fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-scale Style Modeling

Sep 03, 2023
Zhichao Wang, Xinsheng Wang, Qicong Xie, Tao Li, Lei Xie, Qiao Tian, Yuping Wang

Via arXiv

All-for-One and One-For-All: Deep learning-based feature fusion for Synthetic Speech Detection

Jul 28, 2023
Daniele Mari, Davide Salvi, Paolo Bestagini, Simone Milani

Via arXiv

Capturing Spectral and Long-term Contextual Information for Speech Emotion Recognition Using Deep Learning Techniques

Aug 04, 2023
Samiul Islam, Md. Maksudul Haque, Abu Jobayer Md. Sadat

Via arXiv

RobustL2S: Speaker-Specific Lip-to-Speech Synthesis exploiting Self-Supervised Representations

Jul 03, 2023
Neha Sahipjohn, Neil Shah, Vishal Tambrahalli, Vineet Gandhi

Via arXiv

SeqXGPT: Sentence-Level AI-Generated Text Detection

Oct 13, 2023
Pengyu Wang, Linyang Li, Ke Ren, Botian Jiang, Dong Zhang, Xipeng Qiu

Via arXiv

Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks

Sep 18, 2023
Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li

Via arXiv

Topic Identification For Spontaneous Speech: Enriching Audio Features With Embedded Linguistic Information

Jul 21, 2023
Dejan Porjazovski, Tamás Grósz, Mikko Kurimo

Via arXiv

PromptTTS 2: Describing and Generating Voices with Text Prompt

Sep 05, 2023
Yichong Leng, Zhifang Guo, Kai Shen, Xu Tan, Zeqian Ju, Yanqing Liu, Yufei Liu, Dongchao Yang, Leying Zhang, Kaitao Song, Lei He, Xiang-Yang Li, Sheng Zhao, Tao Qin, Jiang Bian

Via arXiv

UnitSpeech: Speaker-adaptive Speech Synthesis with Untranscribed Data

Jun 28, 2023
Heeseung Kim, Sungwon Kim, Jiheum Yeom, Sungroh Yoon

Via arXiv

EmoSpeech: Guiding FastSpeech2 Towards Emotional Text to Speech

Jun 28, 2023
Daria Diatlova, Vitaly Shutov

Via arXiv