"speech": models, code, and papers

Non-Parametric Domain Adaptation for End-to-End Speech Translation

May 23, 2022
Yichao Du, Weizhi Wang, Zhirui Zhang, Boxing Chen, Tong Xu, Jun Xie, Enhong Chen

Via arXiv

MDNet: Learning Monaural Speech Enhancement from Deep Prior Gradient

Mar 16, 2022
Andong Li, Chengshi Zheng, Ziyang Zhang, Xiaodong Li

Via arXiv

Unsupervised data selection for Speech Recognition with contrastive loss ratios

Jul 25, 2022
Chanho Park, Rehan Ahmad, Thomas Hain

Via arXiv

ImportantAug: a data augmentation agent for speech

Dec 14, 2021
Viet Anh Trinh, Hassan Salami Kavaki, Michael I Mandel

Via arXiv

Efficient Encoders for Streaming Sequence Tagging

Jan 23, 2023
Ayush Kaushal, Aditya Gupta, Shyam Upadhyay, Manaal Faruqui

Via arXiv

Enhancing Speech Recognition Decoding via Layer Aggregation

Apr 05, 2022
Tomer Wullach, Shlomo E. Chazan

Via arXiv

Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

Mar 15, 2022
Tiantian Feng, Shrikanth Narayanan

Via arXiv

Emotional Speech Recognition with Pre-trained Deep Visual Models

Apr 06, 2022
Waleed Ragheb, Mehdi Mirzapour, Ali Delfardi, Hélène Jacquenet, Lawrence Carbon

Via arXiv

Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

Dec 07, 2022
Fengyu Yang, Jian Luan, Yujun Wang

Via arXiv

Successes and critical failures of neural networks in capturing human-like speech recognition

Apr 06, 2022
Federico Adolfi, Jeffrey S. Bowers, David Poeppel

Via arXiv