"speech": models, code, and papers
On using the UA-Speech and TORGO databases to validate automatic dysarthric speech classification approaches

Nov 16, 2022
Guilherme Schu, Parvaneh Janbakhshi, Ina Kodrasi

Figures 1–4

A Sidecar Separator Can Convert a Single-Speaker Speech Recognition System to a Multi-Speaker One

Feb 20, 2023
Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng

Figures 1–4

Relate auditory speech to EEG by shallow-deep attention-based network

Mar 20, 2023
Fan Cui, Liyong Guo, Lang He, Jiyao Liu, ErCheng Pei, Yujun Wang, Dongmei Jiang

Figures 1–3

MIXPGD: Hybrid Adversarial Training for Speech Recognition Systems

Mar 10, 2023
Aminul Huq, Weiyi Zhang, Xiaolin Hu

Figures 1–4

Using Kaldi for Automatic Speech Recognition of Conversational Austrian German

Jan 16, 2023
Julian Linke, Saskia Wepner, Gernot Kubin, Barbara Schuppler

Figures 1–4

ADD 2023: the Second Audio Deepfake Detection Challenge

May 23, 2023
Jiangyan Yi, Jianhua Tao, Ruibo Fu, Xinrui Yan, Chenglong Wang, Tao Wang, Chu Yuan Zhang, Xiaohui Zhang, Yan Zhao, Yong Ren, Le Xu, Junzuo Zhou, Hao Gu, Zhengqi Wen, Shan Liang, Zheng Lian, Shuai Nie, Haizhou Li

Figures 1–4

Align With Purpose: Optimize Desired Properties in CTC Models with a General Plug-and-Play Framework

Jul 04, 2023
Eliya Segev, Maya Alroy, Ronen Katsir, Noam Wies, Ayana Shenhav, Yael Ben-Oren, David Zar, Oren Tadmor, Jacob Bitterman, Amnon Shashua, Tal Rosenwein

Figures 1–4

MF-PAM: Accurate Pitch Estimation through Periodicity Analysis and Multi-level Feature Fusion

Jun 16, 2023
Woo-Jin Chung, Doyeon Kim, Soo-Whan Chung, Hong-Goo Kang

Figures 1–4

Lexical Retrieval Hypothesis in Multimodal Context

May 28, 2023
Po-Ya Angela Wang, Pin-Er Chen, Hsin-Yu Chou, Yu-Hsiang Tseng, Shu-Kai Hsieh

Figures 1–4

Connecting Humanities and Social Sciences: Applying Language and Speech Technology to Online Panel Surveys

Feb 21, 2023
Henk van den Heuvel, Martijn Bentum, Simone Wills, Judith C. Koops

Figures 1–4