"speech": models, code, and papers

Adversarial Multi-Task Learning for Disentangling Timbre and Pitch in Singing Voice Synthesis

Jun 23, 2022
Tae-Woo Kim, Min-Su Kang, Gyeong-Hoon Lee

Non-autoregressive Mandarin-English Code-switching Speech Recognition with Pinyin Mask-CTC and Word Embedding Regularization

Apr 06, 2021
Shun-Po Chuang, Heng-Jui Chang, Sung-Feng Huang, Hung-yi Lee

End-to-End Adversarial Text-to-Speech

Jun 05, 2020
Jeff Donahue, Sander Dieleman, Mikołaj Bińkowski, Erich Elsen, Karen Simonyan

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

Apr 07, 2022
Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

Integration of deep learning with expectation maximization for spatial cue based speech separation in reverberant conditions

Feb 26, 2021
Sania Gul, Muhammad Salman Khan, Syed Waqar Shah

WHALETRANS: E2E WHisper to nAturaL spEech conversion using modified TRANSformer network

Apr 20, 2020
Abhishek Niranjan, Mukesh Sharma, Sai Bharath Chandra Gutha, M Ali Basha Shaik

GestureLens: Visual Analysis of Gestures in Presentation Videos

Apr 23, 2022
Haipeng Zeng, Xingbo Wang, Yong Wang, Aoyu Wu, Ting Chuen Pong, Huamin Qu

Cross-sentence Neural Language Models for Conversational Speech Recognition

Jul 08, 2021
Shih-Hsuan Chiu, Tien-Hong Lo, Berlin Chen

Jointly Fine-Tuning "BERT-like" Self Supervised Models to Improve Multimodal Speech Emotion Recognition

Aug 15, 2020
Shamane Siriwardhana, Andrew Reis, Rivindu Weerasekera, Suranga Nanayakkara

Speech Paralinguistic Approach for Detecting Dementia Using Gated Convolutional Neural Network

Apr 16, 2020
Tifani Warnita, Mariana Rodrigues Makiuchi, Nakamasa Inoue, Koichi Shinoda, Michitaka Yoshimura, Momoko Kitazawa, Kei Funaki, Yoko Eguchi, Taishiro Kishimoto
