"speech": models, code, and papers

VoxBlink: X-Large Speaker Verification Dataset on Camera

Aug 23, 2023
Yuke Lin, Xiaoyi Qin, Ming Cheng, Ning Jiang, Guoqing Zhao, Ming Li

BAN-PL: a Novel Polish Dataset of Banned Harmful and Offensive Content from Wykop.pl web service

Aug 23, 2023
Inez Okulska, Kinga Głąbińska, Anna Kołos, Agnieszka Karlińska, Emilia Wiśnios, Adam Nowakowski, Paweł Ellerik, Andrzej Prałat

Self-supervised Learning with Speech Modulation Dropout

Mar 22, 2023
Samik Sadhu, Hynek Hermansky

TranUSR: Phoneme-to-word Transcoder Based Unified Speech Representation Learning for Cross-lingual Speech Recognition

May 23, 2023
Hongfei Xue, Qijie Shao, Peikun Chen, Pengcheng Guo, Lei Xie, Jie Liu

On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

Jul 25, 2023
Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations

Mar 03, 2023
Yuma Koizumi, Heiga Zen, Shigeki Karita, Yifan Ding, Kohei Yatabe, Nobuyuki Morioka, Yu Zhang, Wei Han, Ankur Bapna, Michiel Bacchiani

Evaluating quantum generative models via imbalanced data classification benchmarks

Aug 21, 2023
Graham R. Enos, Matthew J. Reagor, Eric Hulburd

DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning

May 17, 2023
Alexander H. Liu, Heng-Jui Chang, Michael Auli, Wei-Ning Hsu, James R. Glass

Ensemble prosody prediction for expressive speech synthesis

Apr 03, 2023
Tian Huey Teh, Vivian Hu, Devang S Ram Mohan, Zack Hodari, Christopher G. R. Wallis, Tomás Gomez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark Gales, Simon King

Wav2code: Restore Clean Speech Representations via Codebook Lookup for Noise-Robust ASR

Apr 23, 2023
Yuchen Hu, Chen Chen, Qiushi Zhu, Eng Siong Chng
