"speech": models, code, and papers

Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition

Nov 16, 2021
Yi-Chang Chen, Chun-Yen Cheng, Chien-An Chen, Ming-Chieh Sung, Yi-Ren Yeh

Hate speech detection using static BERT embeddings

Jun 29, 2021
Gaurav Rajput, Narinder Singh Punn, Sanjay Kumar Sonbhadra, Sonali Agarwal

Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network

Dec 04, 2021
Xiaolin Hu, Kai Li, Weiyi Zhang, Yi Luo, Jean-Marie Lemercier, Timo Gerkmann

DelightfulTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2021

Oct 25, 2021
Yanqing Liu, Zhihang Xu, Gang Wang, Kuan Chen, Bohan Li, Xu Tan, Jinzhu Li, Lei He, Sheng Zhao

"Hello, It's Me": Deep Learning-based Speech Synthesis Attacks in the Real World

Sep 20, 2021
Emily Wenger, Max Bronckers, Christian Cianfarani, Jenna Cryan, Angela Sha, Haitao Zheng, Ben Y. Zhao

DeID-VC: Speaker De-identification via Zero-shot Pseudo Voice Conversion

Sep 09, 2022
Ruibin Yuan, Yuxuan Wu, Jacob Li, Jaxter Kim

Real-Time MRI Video synthesis from time aligned phonemes with sequence-to-sequence networks

Oct 30, 2022
Sathvik Udupa, Prasanta Kumar Ghosh

Deciphering Speech: a Zero-Resource Approach to Cross-Lingual Transfer in ASR

Nov 22, 2021
Ondrej Klejch, Electra Wallington, Peter Bell

NORESQA -- A Framework for Speech Quality Assessment using Non-Matching References

Sep 16, 2021
Pranay Manocha, Buye Xu, Anurag Kumar

BiFSMNv2: Pushing Binary Neural Networks for Keyword Spotting to Real-Network Performance

Nov 13, 2022
Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Zejun Ma, Jiakai Wang, Jie Luo, Xianglong Liu
