"speech": models, code, and papers

Improving the quality of neural TTS using long-form content and multi-speaker multi-style modeling

Dec 20, 2022
Tuomo Raitio, Javier Latorre, Andrea Davis, Ladan Golipour

Cross-modal Contrastive Learning for Speech Translation

May 05, 2022
Rong Ye, Mingxuan Wang, Lei Li

Measuring Equality in Machine Learning Security Defenses

Mar 01, 2023
Luke E. Richards, Edward Raff, Cynthia Matuszek

UniSyn: An End-to-End Unified Model for Text-to-Speech and Singing Voice Synthesis

Dec 06, 2022
Yi Lei, Shan Yang, Xinsheng Wang, Qicong Xie, Jixun Yao, Lei Xie, Dan Su

Dynamic Alignment Mask CTC: Improved Mask-CTC with Aligned Cross Entropy

Mar 14, 2023
Xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao

Relating the fundamental frequency of speech with EEG using a dilated convolutional network

Jul 05, 2022
Corentin Puffay, Jana Van Canneyt, Jonas Vanthornhout, Hugo Van hamme, Tom Francart

Multi-modal deep learning system for depression and anxiety detection

Dec 30, 2022
Brian Diep, Marija Stanojevic, Jekaterina Novikova

Simultaneous Speech Extraction for Multiple Target Speakers under the Meeting Scenarios(V1)

Jun 17, 2022
Bang Zeng, Weiqing Wang, Yuanyuan Bao, Ming Li

AI-Based Automated Speech Therapy Tools for persons with Speech Sound Disorders: A Systematic Literature Review

Apr 21, 2022
Chinmoy Deka, Abhishek Shrivastava, Saurabh Nautiyal, Praveen Chauhan

Improving Speech Recognition on Noisy Speech via Speech Enhancement with Multi-Discriminators CycleGAN

Dec 12, 2021
Chia-Yu Li, Ngoc Thang Vu
