"speech": models, code, and papers

Hybrid Attention-based Encoder-decoder Model for Efficient Language Model Adaptation

Sep 14, 2023
Shaoshi Ling, Guoli Ye, Rui Zhao, Yifan Gong

Language-Routing Mixture of Experts for Multilingual and Code-Switching Speech Recognition

Jul 12, 2023
Wenxuan Wang, Guodong Ma, Yuke Li, Binbin Du

Controllable Emphasis with zero data for text-to-speech

Jul 13, 2023
Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova

SpatialNet: Extensively Learning Spatial Information for Multichannel Joint Speech Separation, Denoising and Dereverberation

Jul 31, 2023
Changsheng Quan, Xiaofei Li

Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding

Sep 24, 2023
Dean Ninalga

KIT's Multilingual Speech Translation System for IWSLT 2023

Jun 15, 2023
Danni Liu, Thai Binh Nguyen, Sai Koneru, Enes Yavuz Ugan, Ngoc-Quan Pham, Tuan-Nam Nguyen, Tu Anh Dinh, Carlos Mullov, Alexander Waibel, Jan Niehues

Semantic enrichment towards efficient speech representations

Jul 03, 2023
Gaëlle Laperrière, Ha Nguyen, Sahar Ghannay, Bassam Jabaian, Yannick Estève

VIC-KD: Variance-Invariance-Covariance Knowledge Distillation to Make Keyword Spotting More Robust Against Adversarial Attacks

Sep 22, 2023
Heitor R. Guimarães, Arthur Pimentel, Anderson Avila, Tiago H. Falk

Attentive Multi-Layer Perceptron for Non-autoregressive Generation

Oct 14, 2023
Shuyang Jiang, Jun Zhang, Jiangtao Feng, Lin Zheng, Lingpeng Kong

Generative Spoken Language Model based on continuous word-sized audio tokens

Oct 08, 2023
Robin Algayres, Yossi Adi, Tu Anh Nguyen, Jade Copet, Gabriel Synnaeve, Benoit Sagot, Emmanuel Dupoux
