"speech": models, code, and papers

Non Intrusive Intelligibility Predictor for Hearing Impaired Individuals using Self Supervised Speech Representations

Jul 27, 2023
George Close, Thomas Hain, Stefan Goetze

GASS: Generalizing Audio Source Separation with Large-scale Data

Sep 29, 2023
Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà

UniverSLU: Universal Spoken Language Understanding for Diverse Classification and Sequence Generation Tasks with a Single Network

Oct 04, 2023
Siddhant Arora, Hayato Futami, Jee-weon Jung, Yifan Peng, Roshan Sharma, Yosuke Kashiwagi, Emiru Tsunoo, Shinji Watanabe

Improving severity preservation of healthy-to-pathological voice conversion with global style tokens

Oct 04, 2023
Bence Mark Halpern, Wen-Chin Huang, Lester Phillip Violeta, R. J. J. H. van Son, Tomoki Toda

Powerset multi-class cross entropy loss for neural speaker diarization

Oct 19, 2023
Alexis Plaquet, Hervé Bredin

HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis

Oct 25, 2023
Nafis Irtiza Tripto, Adaku Uchendu, Thai Le, Mattia Setzu, Fosca Giannotti, Dongwon Lee

Intelligible Lip-to-Speech Synthesis with Speech Units

May 31, 2023
Jeongsoo Choi, Minsu Kim, Yong Man Ro

Classification of Dysarthria based on the Levels of Severity. A Systematic Review

Oct 11, 2023
Afnan Al-Ali, Somaya Al-Maadeed, Moutaz Saleh, Rani Chinnappa Naidu, Zachariah C Alex, Prakash Ramachandran, Rajeev Khoodeeram, Rajesh Kumar M

Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models

Sep 04, 2023
Eike J. Nustede, Jörn Anemüller

"We care": Improving Code Mixed Speech Emotion Recognition in Customer-Care Conversations

Aug 06, 2023
N V S Abhishek, Pushpak Bhattacharyya
