"speech": models, code, and papers

Collaborative Learning with Artificial Intelligence Speakers (CLAIS): Pre-Service Elementary Science Teachers' Responses to the Prototype

Dec 20, 2023
Gyeong-Geon Lee, Seonyeong Mun, Myeong-Kyeong Shin, Xiaoming Zhai

Label Smoothing for Enhanced Text Sentiment Classification

Dec 11, 2023
Yijie Gao, Shijing Si

Audio Deepfake Detection with Self-Supervised WavLM and Multi-Fusion Attentive Classifier

Dec 13, 2023
Yinlin Guo, Haofan Huang, Xi Chen, He Zhao, Yuehai Wang

VoxtLM: unified decoder-only models for consolidating speech recognition/synthesis and speech/text continuation tasks

Sep 18, 2023
Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-weon Jung, Xuankai Chang, Shinji Watanabe

Deep Beamforming for Speech Enhancement and Speaker Localization with an Array Response-Aware Loss Function

Oct 22, 2023
Hsinyu Chang, Yicheng Hsu, Mingsian R. Bai

Understanding Probe Behaviors through Variational Bounds of Mutual Information

Dec 15, 2023
Kwanghee Choi, Jee-weon Jung, Shinji Watanabe

Homophone Disambiguation Reveals Patterns of Context Mixing in Speech Transformers

Oct 15, 2023
Hosein Mohebbi, Grzegorz Chrupała, Willem Zuidema, Afra Alishahi

PerMod: Perceptually Grounded Voice Modification with Latent Diffusion Models

Dec 13, 2023
Robin Netzorg, Ajil Jalal, Luna McNulty, Gopala Krishna Anumanchipalli

Do self-supervised speech and language models extract similar representations as human brain?

Oct 07, 2023
Peili Chen, Linyang He, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li

JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions

Oct 09, 2023
Detai Xin, Junfeng Jiang, Shinnosuke Takamichi, Yuki Saito, Akiko Aizawa, Hiroshi Saruwatari