"speech": models, code, and papers

FitHuBERT: Going Thinner and Deeper for Knowledge Distillation of Speech Self-Supervised Learning

Jul 01, 2022
Yeonghyeon Lee, Kangwook Jang, Jahyun Goo, Youngmoon Jung, Hoirin Kim

Via arXiv

Unsupervised domain adaptation for speech recognition with unsupervised error correction

Sep 24, 2022
Long Mai, Julie Carson-Berndsen

Via arXiv

Topic Modeling Based on Two-Step Flow Theory: Application to Tweets about Bitcoin

Mar 03, 2023
Aos Mulahuwaish, Matthew Loucks, Basheer Qolomany, Ala Al-Fuqaha

Via arXiv

WavThruVec: Latent speech representation as intermediate features for neural speech synthesis

Mar 31, 2022
Hubert Siuzdak, Piotr Dura, Pol van Rijn, Nori Jacoby

Via arXiv

SCP-GAN: Self-Correcting Discriminator Optimization for Training Consistency Preserving Metric GAN on Speech Enhancement Tasks

Oct 26, 2022
Vasily Zadorozhnyy, Qiang Ye, Kazuhito Koishida

Via arXiv

GLD-Net: Improving Monaural Speech Enhancement by Learning Global and Local Dependency Features with GLD Block

Jun 30, 2022
Xinmeng Xu, Yang Wang, Jie Jia, Binbin Chen, Jianjun Hao

Via arXiv

Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation

Mar 24, 2022
Ye Jia, Yifan Ding, Ankur Bapna, Colin Cherry, Yu Zhang, Alexis Conneau, Nobuyuki Morioka

Via arXiv

Explainable Human-centered Traits from Head Motion and Facial Expression Dynamics

Feb 23, 2023
Surbhi Madan, Monika Gahalawat, Tanaya Guha, Roland Goecke, Ramanathan Subramanian

Via arXiv

Generalization of Auto-Regressive Hidden Markov Models to Non-Linear Dynamics and Non-Euclidean Observation Space

Feb 23, 2023
Michele Ginesi, Paolo Fiorini

Via arXiv

Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification

Feb 23, 2023
Qiongqiong Wang, Kong Aik Lee, Tianchi Liu

Via arXiv