"speech": models, code, and papers

ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild

Oct 05, 2022
Xuechen Liu, Xin Wang, Md Sahidullah, Jose Patino, Héctor Delgado, Tomi Kinnunen, Massimiliano Todisco, Junichi Yamagishi, Nicholas Evans, Andreas Nautsch, Kong Aik Lee

Via arXiv

GazeReader: Detecting Unknown Word Using Webcam for English as a Second Language (ESL) Learners

Mar 18, 2023
Jiexin Ding, Bowen Zhao, Yuqi Huang, Yuntao Wang, Yuanchun Shi

Via arXiv

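Sejarah dan Perkembangan Teknik Natural Language Processing (NLP) Bahasa Indonesia: Tinjauan tentang sejarah, perkembangan teknologi, dan aplikasi NLP dalam bahasa Indonesia [English: History and Development of Natural Language Processing (NLP) Techniques for Indonesian: A Review of the History, Technological Development, and Applications of NLP in the Indonesian Language]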

Mar 28, 2023
Mukhlis Amien

Via arXiv

QSpeech: Low-Qubit Quantum Speech Application Toolkit

May 26, 2022
Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao

Via arXiv

SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation

May 17, 2022
Sameer Khurana, Antoine Laurent, James Glass

Via arXiv

Contextually-rich human affect perception using multimodal scene information

Mar 13, 2023
Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Shrikanth Narayanan

Via arXiv

Fast and accurate factorized neural transducer for text adaption of end-to-end speech recognition models

Dec 05, 2022
Rui Zhao, Jian Xue, Partha Parthasarathy, Veljko Miljanic, Jinyu Li

Via arXiv

Text-To-Speech Data Augmentation for Low Resource Speech Recognition

Apr 01, 2022
Rodolfo Zevallos

Via arXiv

Talking Head from Speech Audio using a Pre-trained Image Generator

Sep 09, 2022
Mohammed M. Alghamdi, He Wang, Andrew J. Bulpitt, David C. Hogg

Via arXiv

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Jun 21, 2022
Chengyi Wang, Yiming Wang, Yu Wu, Sanyuan Chen, Jinyu Li, Shujie Liu, Furu Wei

Via arXiv