"speech": models, code, and papers

TO-Rawnet: Improving RawNet with TCN and Orthogonal Regularization for Fake Audio Detection

May 23, 2023
Chenglong Wang, Jiangyan Yi, Jianhua Tao, Chuyuan Zhang, Shuai Zhang, Ruibo Fu, Xun Chen

Via arXiv

Improving Speech Prosody of Audiobook Text-to-Speech Synthesis with Acoustic and Textual Contexts

Nov 04, 2022
Detai Xin, Sharath Adavanne, Federico Ang, Ashish Kulkarni, Shinnosuke Takamichi, Hiroshi Saruwatari

Via arXiv

Sample-Efficient Unsupervised Domain Adaptation of Speech Recognition Systems: A case study for Modern Greek

Dec 31, 2022
Georgios Paraskevopoulos, Theodoros Kouzelis, Georgios Rouvalis, Athanasios Katsamanis, Vassilis Katsouros, Alexandros Potamianos

Via arXiv

AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec

May 26, 2023
Yi-Chiao Wu, Israel D. Gebru, Dejan Marković, Alexander Richard

Via arXiv

ABC-KD: Attention-Based-Compression Knowledge Distillation for Deep Learning-Based Noise Suppression

May 26, 2023
Yixin Wan, Yuan Zhou, Xiulian Peng, Kai-Wei Chang, Yan Lu

Via arXiv

A Whisper transformer for audio captioning trained with synthetic captions and transfer learning

May 15, 2023
Marek Kadlčík, Adam Hájek, Jürgen Kieslich, Radosław Winiecki

Via arXiv

SpeechCLIP: Integrating Speech with Pre-Trained Vision and Language Model

Oct 03, 2022
Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Layne Berry, Hung-yi Lee, David Harwath

Via arXiv

Quran Recitation Recognition using End-to-End Deep Learning

May 10, 2023
Ahmad Al Harere, Khloud Al Jallad

Via arXiv

McNet: Fuse Multiple Cues for Multichannel Speech Enhancement

Nov 16, 2022
Yujie Yang, Changsheng Quan, Xiaofei Li

Via arXiv

HuBERT-TR: Reviving Turkish Automatic Speech Recognition with Self-supervised Speech Representation Learning

Oct 13, 2022
Ali Safaya, Engin Erzin

Via arXiv