"speech": models, code, and papers

Unsupervised Text-to-Speech Synthesis by Unsupervised Automatic Speech Recognition

Mar 29, 2022
Junrui Ni, Liming Wang, Heting Gao, Kaizhi Qian, Yang Zhang, Shiyu Chang, Mark Hasegawa-Johnson

Automatic Speech recognition for Speech Assessment of Preschool Children

Mar 24, 2022
Amirhossein Abaskohi, Fatemeh Mortazavi, Hadi Moradi

Modeling Speaker-Listener Interaction for Backchannel Prediction

Apr 10, 2023
Daniel Ortega, Sarina Meyer, Antje Schweitzer, Ngoc Thang Vu

LMEC: Learnable Multiplicative Absolute Position Embedding Based Conformer for Speech Recognition

Dec 05, 2022
Yuguang Yang, Yu Pan, Jingjing Yin, Heng Lu

Unifying the Discrete and Continuous Emotion labels for Speech Emotion Recognition

Oct 29, 2022
Roshan Sharma, Hira Dhamyal, Bhiksha Raj, Rita Singh

Conversion of Acoustic Signal (Speech) Into Text By Digital Filter using Natural Language Processing

Sep 09, 2022
Abhiram Katuri, Sindhu Salugu, Gelli Tharuni, Challa Sri Gouri

Self-Supervised Speech Representations Preserve Speech Characteristics while Anonymizing Voices

Apr 04, 2022
Abner Hernandez, Paula Andrea Pérez-Toro, Juan Camilo Vásquez-Correa, Juan Rafael Orozco-Arroyave, Andreas Maier, Seung Hee Yang

GeneFace++: Generalized and Stable Real-Time Audio-Driven 3D Talking Face Generation

May 01, 2023
Zhenhui Ye, Jinzheng He, Ziyue Jiang, Rongjie Huang, Jiawei Huang, Jinglin Liu, Yi Ren, Xiang Yin, Zejun Ma, Zhou Zhao

Self-Remixing: Unsupervised Speech Separation via Separation and Remixing

Nov 18, 2022
Kohei Saijo, Tetsuji Ogawa

Speaker and Language Change Detection using Wav2vec2 and Whisper

Feb 18, 2023
Tijn Berns, Nik Vaessen, David A. van Leeuwen
