"speech": models, code, and papers

MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

May 12, 2020
Jakob D. Havtorn, Jan Latko, Joakim Edin, Lasse Borgholt, Lars Maaløe, Lorenzo Belgrano, Nicolai F. Jacobsen, Regitze Sdun, Željko Agić

(4 figures)

VARA-TTS: Non-Autoregressive Text-to-Speech Synthesis based on Very Deep VAE with Residual Attention

Feb 12, 2021
Peng Liu, Yuewen Cao, Songxiang Liu, Na Hu, Guangzhi Li, Chao Weng, Dan Su

(4 figures)

Modeling Profanity and Hate Speech in Social Media with Semantic Subspaces

Jun 18, 2021
Vanessa Hahn, Dana Ruiter, Thomas Kleinbauer, Dietrich Klakow

(4 figures)

Whither the Priors for (Vocal) Interactivity?

Mar 16, 2022
Roger K. Moore

(3 figures)

Object Localization Assistive System Based on CV and Vibrotactile Encoding

Jun 19, 2022
Zhikai Wei, Xuhui Hu

(4 figures)

Training Keyword Spotters with Limited and Synthesized Speech Data

Jan 31, 2020
James Lin, Kevin Kilgour, Dominik Roblek, Matthew Sharifi

(4 figures)

Fast Classification Learning with Neural Networks and Conceptors for Speech Recognition and Car Driving Maneuvers

Feb 10, 2021
Stefanie Krause, Oliver Otto, Frieder Stolzenburg

(4 figures)

Learning Audio Representations with MLPs

Mar 16, 2022
Mashrur M. Morshed, Ahmad Omar Ahsan, Hasan Mahmud, Md. Kamrul Hasan

(4 figures)

Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment

May 06, 2022
Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James Glass

(4 figures)

Speaker-Aware Speech-Transformer

Jan 02, 2020
Zhiyun Fan, Jie Li, Shiyu Zhou, Bo Xu

(4 figures)