"speech": models, code, and papers

Speaker and Language Change Detection using Wav2vec2 and Whisper

Feb 18, 2023
Tijn Berns, Nik Vaessen, David A. van Leeuwen

Via arXiv

Performance Disparities Between Accents in Automatic Speech Recognition

Aug 01, 2022
Alex DiChristofano, Henry Shuster, Shefali Chandra, Neal Patwari

Via arXiv

A Phoneme-Informed Neural Network Model for Note-Level Singing Transcription

Apr 12, 2023
Sangeon Yong, Li Su, Juhan Nam

Via arXiv

Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech

Jul 13, 2022
Zhengxi Liu, Qiao Tian, Chenxu Hu, Xudong Liu, Menglin Wu, Yuping Wang, Hang Zhao, Yuxuan Wang

Via arXiv

Non-Asymptotic Pointwise and Worst-Case Bounds for Classical Spectrum Estimators

Mar 21, 2023
Andrew Lamperski

Via arXiv

Speech Enhancement and Dereverberation with Diffusion-based Generative Models

Aug 11, 2022
Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Timo Gerkmann

Via arXiv

Monotonic segmental attention for automatic speech recognition

Oct 26, 2022
Albert Zeyer, Robin Schmitt, Wei Zhou, Ralf Schlüter, Hermann Ney

Via arXiv

T5 for Hate Speech, Augmented Data and Ensemble

Oct 11, 2022
Tosin Adewumi, Sana Sabah Sabry, Nosheen Abid, Foteini Liwicki, Marcus Liwicki

Via arXiv

Differentiate ChatGPT-generated and Human-written Medical Texts

Apr 23, 2023
Wenxiong Liao, Zhengliang Liu, Haixing Dai, Shaochen Xu, Zihao Wu, Yiyang Zhang, Xiaoke Huang, Dajiang Zhu, Hongmin Cai, Tianming Liu, Xiang Li

Via arXiv

Improving Multimodal Speech Recognition by Data Augmentation and Speech Representations

Apr 27, 2022
Dan Oneata, Horia Cucu

Via arXiv