"speech": models, code, and papers

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets

Feb 25, 2022
Kichang Yang, Wonjun Jang, Won Ik Cho

Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates

Aug 18, 2021
Shenhan Qian, Zhi Tu, YiHao Zhi, Wen Liu, Shenghua Gao

Learning Audio-Visual Speech Representation by Masked Multimodal Cluster Prediction

Jan 05, 2022
Bowen Shi, Wei-Ning Hsu, Kushal Lakhotia, Abdelrahman Mohamed

Speech Denoising in the Waveform Domain with Self-Attention

Feb 15, 2022
Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

Self-supervised models of audio effectively explain human cortical responses to speech

May 27, 2022
Aditya R. Vaidya, Shailee Jain, Alexander G. Huth

A novel multimodal dynamic fusion network for disfluency detection in spoken utterances

Nov 27, 2022
Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Manan Suri, Rajiv Ratn Shah

Adversarial Attacks on Speech Recognition Systems for Mission-Critical Applications: A Survey

Feb 22, 2022
Ngoc Dung Huynh, Mohamed Reda Bouadjenek, Imran Razzak, Kevin Lee, Chetan Arora, Ali Hassani, Arkady Zaslavsky

Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech

Jun 05, 2022
Ziyue Jiang, Su Zhe, Zhou Zhao, Qian Yang, Yi Ren, Jinglin Liu, Zhenhui Ye

Blind Restoration of Real-World Audio by 1D Operational GANs

Dec 30, 2022
Turker Ince, Serkan Kiranyaz, Ozer Can Devecioglu, Muhammad Salman Khan, Muhammad Chowdhury, Moncef Gabbouj

On Using Transformers for Speech-Separation

Feb 06, 2022
Cem Subakan, Mirco Ravanelli, Samuele Cornell, Francois Grondin, Mirko Bronzi