"speech": models, code, and papers

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection

May 23, 2023
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chuyuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen

A Whisper transformer for audio captioning trained with synthetic captions and transfer learning

May 15, 2023
Marek Kadlčík, Adam Hájek, Jürgen Kieslich, Radosław Winiecki

Fake News and Hate Speech: Language in Common

Dec 05, 2022
Berta Chulvi, Alejandro Toselli, Paolo Rosso

ArzEn-ST: A Three-way Speech Translation Corpus for Code-Switched Egyptian Arabic - English

Nov 22, 2022
Injy Hamed, Nizar Habash, Slim Abdennadher, Ngoc Thang Vu

AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec

May 26, 2023
Yi-Chiao Wu, Israel D. Gebru, Dejan Marković, Alexander Richard

ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression

May 26, 2023
Yixin Wan, Yuan Zhou, Xiulian Peng, Kai-Wei Chang, Yan Lu

Quran Recitation Recognition using End-to-End Deep Learning

May 10, 2023
Ahmad Al Harere, Khloud Al Jallad

wav2vec and its current potential to Automatic Speech Recognition in German for the usage in Digital History: A comparative assessment of available ASR-technologies for the use in cultural heritage contexts

Mar 06, 2023
Michael Fleck, Wolfgang Göderle

Structured State Space Decoder for Speech Recognition and Synthesis

Oct 31, 2022
Koichi Miyazaki, Masato Murata, Tomoki Koriyama

Seeing What You Said: Talking Face Generation Guided by a Lip Reading Expert

Mar 29, 2023
Jiadong Wang, Xinyuan Qian, Malu Zhang, Robby T. Tan, Haizhou Li
