"speech": models, code, and papers

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition

Jul 02, 2021
Niko Moritz, Takaaki Hori, Jonathan Le Roux

SQ-VAE: Variational Bayes on Discrete Representation with Self-annealed Stochastic Quantization

May 16, 2022
Yuhta Takida, Takashi Shibuya, WeiHsiang Liao, Chieh-Hsin Lai, Junki Ohmura, Toshimitsu Uesaka, Naoki Murata, Shusuke Takahashi, Toshiyuki Kumakura, Yuki Mitsufuji

FastSpeech 2: Fast and High-Quality End-to-End Text to Speech

Jun 22, 2020
Yi Ren, Chenxu Hu, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, Tie-Yan Liu

Estimating articulatory movements in speech production with transformer networks

Apr 11, 2021
Sathvik Udupa, Anwesha Roy, Abhayjeet Singh, Aravind Illa, Prasanta Kumar Ghosh

Robustness of end-to-end Automatic Speech Recognition Models -- A Case Study using Mozilla DeepSpeech

May 08, 2021
Aashish Agarwal, Torsten Zesch

DeepSpectrumLite: A Power-Efficient Transfer Learning Framework for Embedded Speech and Audio Processing from Decentralised Data

Apr 23, 2021
Shahin Amiriparian, Tobias Hübner, Maurice Gerczuk, Sandra Ottl, Björn W. Schuller

KUCST@LT-EDI-ACL2022: Detecting Signs of Depression from Social Media Text

Apr 09, 2022
Manex Agirrezabal, Janek Amann

FAIR4Cov: Fused Audio Instance and Representation for COVID-19 Detection

Apr 22, 2022
Tuan Truong, Matthias Lenga, Antoine Serrurier, Sadegh Mohammadi

End-to-End Speech Recognition from Federated Acoustic Models

Apr 29, 2021
Yan Gao, Titouan Parcollet, Javier Fernandez-Marques, Pedro P. B. de Gusmao, Daniel J. Beutel, Nicholas D. Lane

Deep Double-Side Learning Ensemble Model for Few-Shot Parkinson Speech Recognition

Jun 20, 2020
Yongming Li, Lang Zhou, Lingyun Qin, Yuwei Zeng, Yuchuan Liu, Yan Lei, Pin Wang, Fan Li
