"speech": models, code, and papers
Multi-Label Training for Text-Independent Speaker Identification

Nov 14, 2022
Yuqi Xue

NaturalSpeech: End-to-End Text to Speech Synthesis with Human-Level Quality

May 10, 2022
Xu Tan, Jiawei Chen, Haohe Liu, Jian Cong, Chen Zhang, Yanqing Liu, Xi Wang, Yichong Leng, Yuanhao Yi, Lei He, Frank Soong, Tao Qin, Sheng Zhao, Tie-Yan Liu

Conversational Speech Recognition By Learning Conversation-level Characteristics

Feb 17, 2022
Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma

Improved two-stage hate speech classification for twitter based on Deep Neural Networks

Jun 08, 2022
Georgios K. Pitsilis

DMF-Net: A decoupling-style multi-band fusion model for real-time full-band speech enhancement

Mar 02, 2022
Guochen Yu, Yuansheng Guan, Weixin Meng, Chengshi Zheng, Hui Wang

Comparison of Speech Representations for the MOS Prediction System

Jun 28, 2022
Aki Kunikoshi, Jaebok Kim, Wonsuk Jun, Kåre Sjölander

Integrated Speech and Gesture Synthesis

Aug 25, 2021
Siyang Wang, Simon Alexanderson, Joakim Gustafson, Jonas Beskow, Gustav Eje Henter, Éva Székely

Can we still use PEAQ? A Performance Analysis of the ITU Standard for the Objective Assessment of Perceived Audio Quality

Dec 02, 2022
Pablo M. Delgado, Jürgen Herre

Over-Generation Cannot Be Rewarded: Length-Adaptive Average Lagging for Simultaneous Speech Translation

Jun 20, 2022
Sara Papi, Marco Gaido, Matteo Negri, Marco Turchi

On the Role of Spatial, Spectral, and Temporal Processing for DNN-based Non-linear Multi-channel Speech Enhancement

Jun 22, 2022
Kristina Tesch, Nils-Hendrik Mohrmann, Timo Gerkmann
