"speech recognition": models, code, and papers

Rationalizing Predictions by Adversarial Information Calibration

Jan 15, 2023
Lei Sha, Oana-Maria Camburu, Thomas Lukasiewicz

Via arXiv

Dysfluencies Seldom Come Alone -- Detection as a Multi-Label Problem

Oct 28, 2022
Sebastian P. Bayerl, Dominik Wagner, Florian Hönig, Tobias Bocklet, Elmar Nöth, Korbinian Riedhammer

Via arXiv

Multimodal Speech Recognition with Unstructured Audio Masking

Oct 16, 2020
Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott

Via arXiv

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition

Jan 14, 2022
Mengzhe Geng, Shansong Liu, Jianwei Yu, Xurong Xie, Shoukang Hu, Zi Ye, Zengrui Jin, Xunying Liu, Helen Meng

Via arXiv

Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

Oct 12, 2021
Li-Wei Chen, Alexander Rudnicky

Via arXiv

Representation Learning to Classify and Detect Adversarial Attacks against Speaker and Speech Recognition Systems

Jul 09, 2021
Jesús Villalba, Sonal Joshi, Piotr Żelasko, Najim Dehak

Via arXiv

Context-aware Fine-tuning of Self-supervised Speech Models

Dec 16, 2022
Suwon Shon, Felix Wu, Kwangyoun Kim, Prashant Sridhar, Karen Livescu, Shinji Watanabe

Via arXiv

An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition

Oct 09, 2021
Xuankai Chang, Takashi Maekaku, Pengcheng Guo, Jing Shi, Yen-Ju Lu, Aswin Shanmugam Subramanian, Tianzi Wang, Shu-wen Yang, Yu Tsao, Hung-yi Lee, Shinji Watanabe

Via arXiv

Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech

Oct 27, 2022
Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran

Via arXiv

Adjust-free adversarial example generation in speech recognition using evolutionary multi-objective optimization under black-box condition

Dec 22, 2020
Shoma Ishida, Satoshi Ono

Via arXiv