"speech": models, code, and papers

Decoding Continuous Character-based Language from Non-invasive Brain Recordings

Mar 19, 2024
Cenyuan Zhang, Xiaoqing Zheng, Ruicheng Yin, Shujie Geng, Jianhan Xu, Xuan Gao, Changze Lv, Zixuan Ling, Xuanjing Huang, Miao Cao, Jianfeng Feng

Via arXiv

EMO-SUPERB: An In-depth Look at Speech Emotion Recognition

Feb 22, 2024
Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee

Via arXiv

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

Feb 19, 2024
Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

Via arXiv

Prompt-Singer: Controllable Singing-Voice-Synthesis with Natural Language Prompt

Mar 18, 2024
Yongqi Wang, Ruofan Hu, Rongjie Huang, Zhiqing Hong, Ruiqi Li, Wenrui Liu, Fuming You, Tao Jin, Zhou Zhao

Via arXiv

Exploring Green AI for Audio Deepfake Detection

Mar 21, 2024
Subhajit Saha, Md Sahidullah, Swagatam Das

Via arXiv

Towards Environmental Preference Based Speech Enhancement For Individualised Multi-Modal Hearing Aids

Feb 26, 2024
Jasper Kirton-Wingate, Shafique Ahmed, Adeel Hussain, Mandar Gogate, Kia Dashtipour, Jen-Cheng Hou, Tassadaq Hussain, Yu Tsao, Amir Hussain

Via arXiv

Probing Self-supervised Learning Models with Target Speech Extraction

Feb 17, 2024
Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Takanori Ashihara, Shoko Araki, Jan Cernocky

Via arXiv

Probing the Information Encoded in Neural-based Acoustic Models of Automatic Speech Recognition Systems

Feb 29, 2024
Quentin Raymondaud, Mickael Rouvier, Richard Dufour

Via arXiv

Leveraging Linguistically Enhanced Embeddings for Open Information Extraction

Mar 20, 2024
Fauzan Farooqui, Thanmay Jayakumar, Pulkit Mathur, Mansi Radke

Via arXiv

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Mar 07, 2024
Yusheng Dai, Hang Chen, Jun Du, Ruoyu Wang, Shihao Chen, Jiefeng Ma, Haotian Wang, Chin-Hui Lee

Via arXiv