"speech": models, code, and papers

Personalized Speech Enhancement through Self-Supervised Data Augmentation and Purification

Apr 05, 2021
Aswin Sivaraman, Sunwoo Kim, Minje Kim

Via arXiv

Audiovisual Speech Synthesis using Tacotron2

Aug 03, 2020
Ahmed Hussen Abdelaziz, Anushree Prasanna Kumar, Chloe Seivwright, Gabriele Fanelli, Justin Binder, Yannis Stylianou, Sachin Kajarekar

Via arXiv

TransMask: A Compact and Fast Speech Separation Model Based on Transformer

Feb 19, 2021
Zining Zhang, Bingsheng He, Zhenjie Zhang

Via arXiv

A comparison of streaming models and data augmentation methods for robust speech recognition

Nov 19, 2021
Jiyeon Kim, Mehul Kumar, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim

Via arXiv

Speaker Attentive Speech Emotion Recognition

Apr 15, 2021
Clément Le Moine, Nicolas Obin, Axel Roebel

Via arXiv

AdaSpeech: Adaptive Text to Speech for Custom Voice

Mar 01, 2021
Mingjian Chen, Xu Tan, Bohan Li, Yanqing Liu, Tao Qin, Sheng Zhao, Tie-Yan Liu

Via arXiv

TS-RIR: Translated synthetic room impulse responses for speech augmentation

Mar 31, 2021
Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha

Via arXiv

Unsupervised Cross-Lingual Speech Emotion Recognition Using Pseudo Multilabel

Aug 19, 2021
Jin Li, Nan Yan, Lan Wang

Via arXiv

KeypartX: Graph-based Perception (Text) Representation

Sep 23, 2022
Peng Yang

Via arXiv

W2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training

Aug 07, 2021
Yu-An Chung, Yu Zhang, Wei Han, Chung-Cheng Chiu, James Qin, Ruoming Pang, Yonghui Wu

Via arXiv