"speech": models, code, and papers

Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation

Mar 25, 2022
Xue Yang, Changchun Bao

Analysis of Self-Supervised Learning and Dimensionality Reduction Methods in Clustering-Based Active Learning for Speech Emotion Recognition

Jun 21, 2022
Einari Vaaras, Manu Airaksinen, Okko Räsänen

MASS: Multi-task Anthropomorphic Speech Synthesis Framework

May 10, 2021
Jinyin Chen, Linhui Ye, Zhaoyan Ming

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

Jan 26, 2022
Piotr Żelasko, Siyuan Feng, Laureano Moro Velazquez, Ali Abavisani, Saurabhchand Bhati, Odette Scharenborg, Mark Hasegawa-Johnson, Najim Dehak

A study on cross-corpus speech emotion recognition and data augmentation

Jan 10, 2022
Norbert Braunschweiler, Rama Doddipatla, Simon Keizer, Svetlana Stoyanchev

Proficiency assessment of L2 spoken English using wav2vec 2.0

Oct 24, 2022
Stefano Bannò, Marco Matassoni

Phoneme-based Distribution Regularization for Speech Enhancement

Apr 08, 2021
Yajing Liu, Xiulian Peng, Zhiwei Xiong, Yan Lu

Symmetric Saliency-based Adversarial Attack To Speaker Identification

Oct 30, 2022
Jiadi Yao, Xing Chen, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang

Learning to Defer to Multiple Experts: Consistent Surrogate Losses, Confidence Calibration, and Conformal Ensembles

Oct 30, 2022
Rajeev Verma, Daniel Barrejón, Eric Nalisnick

Unsupervised Boundary-Aware Language Model Pretraining for Chinese Sequence Labeling

Oct 27, 2022
Peijie Jiang, Dingkun Long, Yanzhao Zhang, Pengjun Xie, Meishan Zhang, Min Zhang
