"speech": models, code, and papers

Respiratory Distress Detection from Telephone Speech using Acoustic and Prosodic Features

Nov 15, 2020
Meemnur Rashid, Kaisar Ahmed Alman, Khaled Hasan, John H. L. Hansen, Taufiq Hasan


Speech Emotion Recognition Using Deep Sparse Auto-Encoder Extreme Learning Machine with a New Weighting Scheme and Spectro-Temporal Features Along with Classical Feature Selection and A New Quantum-Inspired Dimension Reduction Method

Nov 13, 2021
Fatemeh Daneshfar, Seyed Jahanshah Kabudian


Avoid Overfitting User Specific Information in Federated Keyword Spotting

Jun 17, 2022
Xin-Chun Li, Jin-Lin Tang, Shaoming Song, Bingshuai Li, Yinchuan Li, Yunfeng Shao, Le Gan, De-Chuan Zhan


Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition

Aug 16, 2022
Andrei Andrusenko, Rauf Nasretdinov, Aleksei Romanenko


Improving Multilingual Neural Machine Translation System for Indic Languages

Sep 27, 2022
Sudhansu Bala Das, Atharv Biradar, Tapas Kumar Mishra, Bidyut Kumar Patra


Combining Spatial Clustering with LSTM Speech Models for Multichannel Speech Enhancement

Dec 02, 2020
Felix Grezes, Zhaoheng Ni, Viet Anh Trinh, Michael Mandel


A review of on-device fully neural end-to-end automatic speech recognition algorithms

Dec 19, 2020
Chanwoo Kim, Dhananjaya Gowda, Dongsoo Lee, Jiyeon Kim, Ankur Kumar, Sungsoo Kim, Abhinav Garg, Changwoo Han


IMS-Speech: A Speech to Text Tool

Aug 13, 2019
Pavel Denisov, Ngoc Thang Vu


VOTE400(Voide Of The Elderly 400 Hours): A Speech Dataset to Study Voice Interface for Elderly-Care

Jan 20, 2021
Minsu Jang, Sangwon Seo, Dohyung Kim, Jaeyeon Lee, Jaehong Kim, Jun-Hwan Ahn


Burst2Vec: An Adversarial Multi-Task Approach for Predicting Emotion, Age, and Origin from Vocal Bursts

Jun 24, 2022
Atijit Anuchitanukul, Lucia Specia
