"speech": models, code, and papers

End-to-End Label Uncertainty Modeling in Speech Emotion Recognition using Bayesian Neural Networks and Label Distribution Learning

Sep 30, 2022
Navin Raj Prabhu, Nale Lehmann-Willenbrock, Timo Gerkmann

Via arXiv

Vocal Breath Sound Based Gender Classification

Nov 11, 2022
Mohammad Shaique Solanki, Ashutosh M Bharadwaj, Jeevan K, Prasanta Kumar Ghosh

Via arXiv

Separating Long-Form Speech with Group-Wise Permutation Invariant Training

Nov 17, 2021
Wangyou Zhang, Zhuo Chen, Naoyuki Kanda, Shujie Liu, Jinyu Li, Sefik Emre Eskimez, Takuya Yoshioka, Xiong Xiao, Zhong Meng, Yanmin Qian, Furu Wei

Via arXiv

Learning Contextually Fused Audio-visual Representations for Audio-visual Speech Recognition

Feb 15, 2022
Zi-Qiang Zhang, Jie Zhang, Jian-Shu Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai

Via arXiv

An Overview of Indian Spoken Language Recognition from Machine Learning Perspective

Nov 30, 2022
Spandan Dey, Md Sahidullah, Goutam Saha

Via arXiv

Representing `how you say' with `what you say': English corpus of focused speech and text reflecting corresponding implications

Mar 29, 2022
Naoaki Suzuki, Satoshi Nakamura

Via arXiv

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

Oct 19, 2022
Mostafa Shahin, Beena Ahmed, Julien Epps

Via arXiv

Autodecompose: A generative self-supervised model for semantic decomposition

Feb 13, 2023
Mohammad Reza Bonyadi

Via arXiv

APEACH: Attacking Pejorative Expressions with Analysis on Crowd-Generated Hate Speech Evaluation Datasets

Feb 25, 2022
Kichang Yang, Wonjun Jang, Won Ik Cho

Via arXiv

Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates

Aug 18, 2021
Shenhan Qian, Zhi Tu, YiHao Zhi, Wen Liu, Shenghua Gao

Via arXiv