"speech": models, code, and papers

Speaker conditioning of acoustic models using affine transformation for multi-speaker speech recognition

Oct 30, 2021
Midia Yousefi, John H. L. Hansen

MeWEHV: Mel and Wave Embeddings for Human Voice Tasks

Sep 28, 2022
Andrés Vasco-Carofilis, Laura Fernández-Robles, Enrique Alegre, Eduardo Fidalgo

Recursive Estimation of User Intent from Noninvasive Electroencephalography using Discriminative Models

Oct 29, 2022
Niklas Smedemark-Margulies, Basak Celik, Tales Imbiriba, Aziz Kocanaogullari, Deniz Erdogmus

Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

Oct 07, 2021
Arya Aftab, Alireza Morsali, Shahrokh Ghaemmaghami, Benoit Champagne

Preserving background sound in noise-robust voice conversion via multi-task learning

Nov 06, 2022
Jixun Yao, Yi Lei, Qing Wang, Pengcheng Guo, Ziqian Ning, Lei Xie, Hai Li, Junhui Liu, Danming Xie

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement

Oct 01, 2021
Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux

S-DCCRN: Super Wide Band DCCRN with learnable complex feature for speech enhancement

Nov 16, 2021
Shubo Lv, Yihui Fu, Mengtao Xing, Jiayao Sun, Lei Xie, Jun Huang, Yannan Wang, Tao Yu

HLT-NUS SUBMISSION FOR 2020 NIST Conversational Telephone Speech SRE

Nov 12, 2021
Rohan Kumar Das, Ruijie Tao, Haizhou Li

Controllable cross-speaker emotion transfer for end-to-end speech synthesis

Sep 14, 2021
Tao Li, Xinsheng Wang, Qicong Xie, Zhichao Wang, Lei Xie

Influence Functions for Sequence Tagging Models

Oct 25, 2022
Sarthak Jain, Varun Manjunatha, Byron C. Wallace, Ani Nenkova