"speech recognition": models, code, and papers

Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning

Dec 10, 2022
Chen Chen, Yuchen Hu, Qiang Zhang, Heqing Zou, Beier Zhu, Eng Siong Chng

Via arXiv

Zambezi Voice: A Multilingual Speech Corpus for Zambian Languages

Jun 07, 2023
Claytone Sikasote, Kalinda Siaminwe, Stanly Mwape, Bangiwe Zulu, Mofya Phiri, Martin Phiri, David Zulu, Mayumbo Nyirenda, Antonios Anastasopoulos

Via arXiv

Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models

Oct 13, 2022
Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan

Via arXiv

Emotions Beyond Words: Non-Speech Audio Emotion Recognition With Edge Computing

May 01, 2023
Ibrahim Malik, Siddique Latif, Sanaullah Manzoor, Muhammad Usama, Junaid Qadir, Raja Jurdak

Via arXiv

PEFT-SER: On the Use of Parameter Efficient Transfer Learning Approaches For Speech Emotion Recognition Using Pre-trained Speech Models

Jun 08, 2023
Tiantian Feng, Shrikanth Narayanan

Via arXiv

Modality Confidence Aware Training for Robust End-to-End Spoken Language Understanding

Jul 22, 2023
Suyoun Kim, Akshat Shrivastava, Duc Le, Ju Lin, Ozlem Kalinli, Michael L. Seltzer

Via arXiv

Compensating Removed Frequency Components: Thwarting Voice Spectrum Reduction Attacks

Aug 18, 2023
Shu Wang, Kun Sun, Qi Li

Via arXiv

Improving Speech Emotion Recognition Performance using Differentiable Architecture Search

May 23, 2023
Thejan Rajapakshe, Rajib Rana, Sara Khalifa, Berrak Sisman, Björn Schuller

Via arXiv

An Automatic Speech Recognition System for Bengali Language based on Wav2Vec2 and Transfer Learning

Sep 20, 2022
Tushar Talukder Showrav

Via arXiv

MAC: A unified framework boosting low resource automatic speech recognition

Feb 15, 2023
Zeping Min, Qian Ge, Zhong Li, Weinan E

Via arXiv