"speech": models, code, and papers

Enhancing Cross-lingual Transfer via Phonemic Transcription Integration

Jul 10, 2023
Hoang H. Nguyen, Chenwei Zhang, Tao Zhang, Eugene Rohrbaugh, Philip S. Yu

Via arXiv

Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognit

Mar 23, 2023
Haoyu Tang, Zhaoyi Liu, Chang Zeng, Xinfeng Li

Via arXiv

Language of Bargaining

Jun 12, 2023
Mourad Heddaya, Solomon Dworkin, Chenhao Tan, Rob Voigt, Alexander Zentefis

Via arXiv

VSMask: Defending Against Voice Synthesis Attack via Real-Time Predictive Perturbation

May 09, 2023
Yuanda Wang, Hanqing Guo, Guangjing Wang, Bocheng Chen, Qiben Yan

Via arXiv

DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder

Mar 30, 2023
Chenpeng Du, Qi Chen, Tianyu He, Xu Tan, Xie Chen, Kai Yu, Sheng Zhao, Jiang Bian

Via arXiv

Model Adaptation for ASR in low-resource Indian Languages

Jul 16, 2023
Abhayjeet Singh, Arjun Singh Mehta, Ashish Khuraishi K S, Deekshitha G, Gauri Date, Jai Nanavati, Jesuraja Bandekar, Karnalius Basumatary, Karthika P, Sandhya Badiger, Sathvik Udupa, Saurabh Kumar, Savitha, Prasanta Kumar Ghosh, Prashanthi V, Priyanka Pai, Raoul Nanavati, Rohan Saxena, Sai Praneeth Reddy Mora, Srinivasa Raghavan

Via arXiv

EM-Network: Oracle Guided Self-distillation for Sequence Learning

Jun 14, 2023
Ji Won Yoon, Sunghwan Ahn, Hyeonseung Lee, Minchan Kim, Seok Min Kim, Nam Soo Kim

Via arXiv

Self-supervised speech representation learning for keyword-spotting with light-weight transformers

Mar 07, 2023
Chenyang Gao, Yue Gu, Francesco Caliva, Yuzong Liu

Via arXiv

High-Fidelity Audio Compression with Improved RVQGAN

Jun 11, 2023
Rithesh Kumar, Prem Seetharaman, Alejandro Luebs, Ishaan Kumar, Kundan Kumar

Via arXiv

Using Deepfake Technologies for Word Emphasis Detection

May 12, 2023
Eran Kaufman, Lee-Ad Gottlieb

Via arXiv