"speech": models, code, and papers

Joint Audio-Text Model for Expressive Speech-Driven 3D Facial Animation

Dec 07, 2021
Yingruo Fan, Zhaojiang Lin, Jun Saito, Wenping Wang, Taku Komura

Via arXiv

Mix and Localize: Localizing Sound Sources in Mixtures

Nov 28, 2022
Xixi Hu, Ziyang Chen, Andrew Owens

Via arXiv

Scaling Up Deliberation for Multilingual ASR

Oct 11, 2022
Ke Hu, Bo Li, Tara N. Sainath

Via arXiv

Probing Deep Speaker Embeddings for Speaker-related Tasks

Dec 14, 2022
Zifeng Zhao, Ding Pan, Junyi Peng, Rongzhi Gu

Via arXiv

The IWSLT 2021 BUT Speech Translation Systems

Jul 13, 2021
Hari Krishna Vydana, Martin Karafiát, Lukáš Burget, "Honza" Černocký

Via arXiv

Investigating self-supervised front ends for speech spoofing countermeasures

Nov 15, 2021
Xin Wang, Junichi Yamagishi

Via arXiv

An Objective Evaluation Framework for Pathological Speech Synthesis

Jul 01, 2021
Bence Mark Halpern, Julian Fritsch, Enno Hermann, Rob van Son, Odette Scharenborg, Mathew Magimai.-Doss

Via arXiv

Domain Specific Wav2vec 2.0 Fine-tuning For The SE&R 2022 Challenge

Jul 29, 2022
Alef Iury Siqueira Ferreira, Gustavo dos Reis Oliveira

Via arXiv

Acoustic-aware Non-autoregressive Spell Correction with Mask Sample Decoding

Oct 16, 2022
Ruchao Fan, Guoli Ye, Yashesh Gaur, Jinyu Li

Via arXiv

Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features

Nov 03, 2021
Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao

Via arXiv