"speech": models, code, and papers

Self-Supervised Speech Representation Learning: A Review

May 21, 2022
Abdelrahman Mohamed, Hung-yi Lee, Lasse Borgholt, Jakob D. Havtorn, Joakim Edin, Christian Igel, Katrin Kirchhoff, Shang-Wen Li, Karen Livescu, Lars Maaløe, Tara N. Sainath, Shinji Watanabe

Via arXiv

End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation

Apr 01, 2022
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe

Via arXiv

Learning Audio-Driven Viseme Dynamics for 3D Face Animation

Jan 15, 2023
Linchao Bao, Haoxian Zhang, Yue Qian, Tangli Xue, Changhai Chen, Xuefei Zhe, Di Kang

Via arXiv

Learning Speaker-specific Lip-to-Speech Generation

Jun 04, 2022
Munender Varshney, Ravindra Yadav, Vinay P. Namboodiri, Rajesh M Hegde

Via arXiv

Contrastive Representation Learning for Acoustic Parameter Estimation

Mar 13, 2023
Philipp Götz, Cagdas Tuna, Andreas Walther, Emanuël A. P. Habets

Via arXiv

Improving Noisy Student Training on Non-target Domain Data for Automatic Speech Recognition

Nov 09, 2022
Yu Chen, Wen Ding, Junjie Lai

Via arXiv

Deploying Enhanced Speech Feature Decreased Audio Complaints at SVT Play VOD Service

Aug 18, 2022
Annika Bidner, Julia Lindberg, Olof Lindman, Kinga Skorupska

Via arXiv

StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer from Style-Based TTS Models

Dec 29, 2022
Yinghao Aaron Li, Cong Han, Nima Mesgarani

Via arXiv

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

Mar 25, 2022
Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang

Via arXiv

SATTS: Speaker Attractor Text to Speech, Learning to Speak by Learning to Separate

Jul 13, 2022
Nabarun Goswami, Tatsuya Harada

Via arXiv