"speech": models, code, and papers

Direction-Aware Joint Adaptation of Neural Speech Enhancement and Recognition in Real Multiparty Conversational Environments

Jul 15, 2022
Yicheng Du, Aditya Arie Nugraha, Kouhei Sekiguchi, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii

Via arXiv

Text-To-Speech Data Augmentation for Low Resource Speech Recognition

Apr 01, 2022
Rodolfo Zevallos

Via arXiv

UniCT DMI Solution for 3rd COV19D Competition on COVID-19 Detection through attention-based CNN for CT Scan

Mar 22, 2023
Alessia Rondinella, Francesco Guarnera, Oliver Giudice, Alessandro Ortis, Francesco Rundo, Sebastiano Battiato

Via arXiv

Stabilizing Transformer Training by Preventing Attention Entropy Collapse

Mar 11, 2023
Shuangfei Zhai, Tatiana Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, Yizhe Zhang, Jiatao Gu, Josh Susskind

Via arXiv

Indian Sign Language Recognition Using Mediapipe Holistic

Apr 20, 2023
Dr. Velmathi G, Kaushal Goyal

Via arXiv

SEM-POS: Grammatically and Semantically Correct Video Captioning

Apr 04, 2023
Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, Armin Mustafa

Via arXiv

Residual Adapters for Few-Shot Text-to-Speech Speaker Adaptation

Oct 28, 2022
Nobuyuki Morioka, Heiga Zen, Nanxin Chen, Yu Zhang, Yifan Ding

Via arXiv

ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild

Oct 05, 2022
Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, Kong Aik Lee

Via arXiv

SoftCorrect: Error Correction with Soft Detection for Automatic Speech Recognition

Dec 02, 2022
Yichong Leng, Xu Tan, Wenjie Liu, Kaitao Song, Rui Wang, Xiang-Yang Li, Tao Qin, Edward Lin, Tie-Yan Liu

Via arXiv

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Jun 21, 2022
Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei

Via arXiv