"speech": models, code, and papers

Unsupervised Accent Adaptation Through Masked Language Model Correction Of Discrete Self-Supervised Speech Units

Sep 25, 2023
Jakob Poncelet, Hugo Van hamme

The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains

Oct 05, 2023
Erica Cooper, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi

LAE-ST-MoE: Boosted Language-Aware Encoder Using Speech Translation Auxiliary Task for E2E Code-switching ASR

Oct 07, 2023
Guodong Ma, Wenxuan Wang, Yuke Li, Yuting Yang, Binbin Du, Haoran Fu

Enhancing Code-switching Speech Recognition with Interactive Language Biases

Sep 29, 2023
Hexin Liu, Leibny Paola Garcia, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur

Instructing Hierarchical Tasks to Robots by Verbal Commands

Nov 30, 2023
P. Telkes, A. Angleraud, R. Pieters

Weak Alignment Supervision from Hybrid Model Improves End-to-end ASR

Nov 30, 2023
Jintao Jiang, Yingbo Gao, Zoltan Tuske

On the Relevance of Phoneme Duration Variability of Synthesized Training Data for Automatic Speech Recognition

Oct 12, 2023
Nick Rossenbach, Benedikt Hilmes, Ralf Schlüter

DiariST: Streaming Speech Translation with Speaker Diarization

Sep 14, 2023
Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka

Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS

Sep 14, 2023
Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen

LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models

Nov 28, 2023
Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao
