
"speech": models, code, and papers

GIST-AiTeR System for the Diarization Task of the 2022 VoxCeleb Speaker Recognition Challenge

Oct 06, 2022
Dongkeon Park, Yechan Yu, Kyeong Wan Park, Ji Won Kim, Hong Kook Kim


SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning

Jun 27, 2022
Zuheng Kang, Junqing Peng, Jianzong Wang, Jing Xiao


Hybrid Handcrafted and Learnable Audio Representation for Analysis of Speech Under Cognitive and Physical Load

Mar 30, 2022
Gasser Elbanna, Alice Biryukov, Neil Scheidwasser-Clow, Lara Orlandic, Pablo Mainar, Mikolaj Kegler, Pierre Beckmann, Milos Cernak


High-resolution embedding extractor for speaker diarisation

Nov 08, 2022
Hee-Soo Heo, Youngki Kwon, Bong-Jin Lee, You Jin Kim, Jee-weon Jung


Extracting linguistic speech patterns of Japanese fictional characters using subword units

Mar 05, 2022
Mika Kishino, Kanako Komiya


USC: An Open-Source Uzbek Speech Corpus and Initial Speech Recognition Experiments

Jul 30, 2021
Muhammadjon Musaev, Saida Mussakhojayeva, Ilyos Khujayorov, Yerbolat Khassanov, Mannon Ochilov, Huseyin Atakan Varol


Unify and Conquer: How Phonetic Feature Representation Affects Polyglot Text-To-Speech (TTS)

Jul 04, 2022
Ariadna Sanchez, Alessio Falai, Ziyao Zhang, Orazio Angelini, Kayoko Yanagisawa


A Novel Exploitative and Explorative GWO-SVM Algorithm for Smart Emotion Recognition

Jan 05, 2023
Xucun Yan, Zihuai Lin, Zhiyun Lin, Branka Vucetic


Cross-Lingual Text-to-Speech Using Multi-Task Learning and Speaker Classifier Joint Training

Jan 20, 2022
J. Yang, Lei He


Visual Speech Recognition for Multiple Languages in the Wild

Feb 26, 2022
Pingchuan Ma, Stavros Petridis, Maja Pantic
