"speech": models, code, and papers

ARC-NLP at Multimodal Hate Speech Event Detection 2023: Multimodal Methods Boosted by Ensemble Learning, Syntactical and Entity Features

Jul 25, 2023
Umitcan Sahin, Izzet Emre Kucukkaya, Oguzhan Ozcelik, Cagri Toraman

An empirical study on speech restoration guided by self supervised speech representation

May 30, 2023
Jaeuk Byun, Youna Ji, Soo Whan Chung, Soyeon Choe, Min Seok Choi

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs

Jul 18, 2023
Yinghao Aaron Li, Cong Han, Nima Mesgarani

Noise-aware Speech Enhancement using Diffusion Probabilistic Model

Jul 16, 2023
Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng

SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer

Jul 20, 2023
Daegyeom Kim, Seongho Hong, Yong-Hoon Choi

SCRAPS: Speech Contrastive Representations of Acoustic and Phonetic Spaces

Jul 23, 2023
Ivan Vallés-Pérez, Grzegorz Beringer, Piotr Bilinski, Gary Cook, Roberto Barra-Chicote

Temporally Aligning Long Audio Interviews with Questions: A Case Study in Multimodal Data Integration

Oct 10, 2023
Piyush Singh Pasi, Karthikeya Battepati, Preethi Jyothi, Ganesh Ramakrishnan, Tanmay Mahapatra, Manoj Singh

Single Channel Speech Enhancement Using U-Net Spiking Neural Networks

Jul 26, 2023
Abir Riahi, Éric Plourde

ALDi: Quantifying the Arabic Level of Dialectness of Text

Oct 20, 2023
Amr Keleg, Sharon Goldwater, Walid Magdy

Corpus Synthesis for Zero-shot ASR domain Adaptation using Large Language Models

Sep 18, 2023
Hsuan Su, Ting-Yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren Yang, Gautam Varma Mantena, Oncel Tuzel
