Alert button

"speech": models, code, and papers
Alert button

Towards Selection of Text-to-speech Data to Augment ASR Training

May 30, 2023
Shuo Liu, Leda Sarı, Chunyang Wu, Gil Keren, Yuan Shangguan, Jay Mahadeokar, Ozlem Kalinli

Figure 1 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 2 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 3 for Towards Selection of Text-to-speech Data to Augment ASR Training
Figure 4 for Towards Selection of Text-to-speech Data to Augment ASR Training
Viaarxiv icon

DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement

Add code
Bookmark button
Alert button
May 14, 2023
Hendrik Schröter, Tobias Rosenkranz, Alberto N. Escalante-B., Andreas Maier

Figure 1 for DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
Figure 2 for DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
Figure 3 for DeepFilterNet: Perceptually Motivated Real-Time Speech Enhancement
Viaarxiv icon

Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

Sep 22, 2023
Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli

Figure 1 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 2 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 3 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Figure 4 for Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Viaarxiv icon

ML-SUPERB: Multilingual Speech Universal PERformance Benchmark

Add code
Bookmark button
Alert button
May 18, 2023
Jiatong Shi, Dan Berrebbi, William Chen, Ho-Lam Chung, En-Pei Hu, Wei Ping Huang, Xuankai Chang, Shang-Wen Li, Abdelrahman Mohamed, Hung-yi Lee, Shinji Watanabe

Figure 1 for ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Figure 2 for ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Figure 3 for ML-SUPERB: Multilingual Speech Universal PERformance Benchmark
Viaarxiv icon

ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition

Add code
Bookmark button
Alert button
May 25, 2023
Yuanchao Li, Zeyu Zhao, Ondrej Klejch, Peter Bell, Catherine Lai

Figure 1 for ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition
Figure 2 for ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition
Figure 3 for ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition
Figure 4 for ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition
Viaarxiv icon

Streaming Speech-to-Confusion Network Speech Recognition

Add code
Bookmark button
Alert button
Jun 02, 2023
Denis Filimonov, Prabhat Pandey, Ariya Rastrow, Ankur Gandhe, Andreas Stolcke

Figure 1 for Streaming Speech-to-Confusion Network Speech Recognition
Figure 2 for Streaming Speech-to-Confusion Network Speech Recognition
Figure 3 for Streaming Speech-to-Confusion Network Speech Recognition
Figure 4 for Streaming Speech-to-Confusion Network Speech Recognition
Viaarxiv icon

SparseVSR: Lightweight and Noise Robust Visual Speech Recognition

Jul 10, 2023
Adriana Fernandez-Lopez, Honglie Chen, Pingchuan Ma, Alexandros Haliassos, Stavros Petridis, Maja Pantic

Figure 1 for SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
Figure 2 for SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
Figure 3 for SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
Figure 4 for SparseVSR: Lightweight and Noise Robust Visual Speech Recognition
Viaarxiv icon

PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework

Add code
Bookmark button
Alert button
Jun 15, 2023
Paras Sheth, Tharindu Kumarage, Raha Moraffah, Aman Chadha, Huan Liu

Figure 1 for PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework
Figure 2 for PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework
Figure 3 for PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework
Figure 4 for PEACE: Cross-Platform Hate Speech Detection- A Causality-guided Framework
Viaarxiv icon

UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding

Add code
Bookmark button
Alert button
Jun 13, 2023
Chenpeng Du, Yiwei Guo, Feiyu Shen, Zhijun Liu, Zheng Liang, Xie Chen, Shuai Wang, Hui Zhang, Kai Yu

Figure 1 for UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Figure 2 for UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Figure 3 for UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Figure 4 for UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding
Viaarxiv icon

Enhancing Speech-to-Speech Translation with Multiple TTS Targets

Apr 10, 2023
Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe

Figure 1 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 2 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 3 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Figure 4 for Enhancing Speech-to-Speech Translation with Multiple TTS Targets
Viaarxiv icon