"speech": models, code, and papers

Blind Signal Dereverberation for Machine Speech Recognition

Sep 30, 2022
Samik Sadhu, Hynek Hermansky

Via arXiv

Leveraging Semantic Information for Efficient Self-Supervised Emotion Recognition with Audio-Textual Distilled Models

May 30, 2023
Danilo de Oliveira, Navin Raj Prabhu, Timo Gerkmann

Via arXiv

Human-in-the-Loop Hate Speech Classification in a Multilingual Context

Dec 05, 2022
Ana Kotarcic, Dominik Hangartner, Fabrizio Gilardi, Selina Kurer, Karsten Donnay

Via arXiv

Machine Unlearning: A Survey

Jun 06, 2023
Heng Xu, Tianqing Zhu, Lefeng Zhang, Wanlei Zhou, Philip S. Yu

Via arXiv

High Fidelity Speech Enhancement with Band-split RNN

Dec 01, 2022
Jianwei Yu, Yi Luo, Hangting Chen, Rongzhi Gu, Chao Weng

Via arXiv

Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR

May 28, 2023
W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-yiin Chang, Tara N. Sainath

Via arXiv

LongFNT: Long-form Speech Recognition with Factorized Neural Transducer

Nov 17, 2022
Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian

Via arXiv

Improving Autoregressive NLP Tasks via Modular Linearized Attention

Apr 24, 2023
Victor Agostinelli, Lizhong Chen

Via arXiv

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining

Jan 30, 2023
Takaaki Saeki, Soumi Maiti, Xinjian Li, Shinji Watanabe, Shinnosuke Takamichi, Hiroshi Saruwatari

Via arXiv

Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring

Mar 20, 2023
Joanna Hong, Minsu Kim, Jeongsoo Choi, Yong Man Ro

Via arXiv