"speech": models, code, and papers

Towards Model-Size Agnostic, Compute-Free, Memorization-based Inference of Deep Learning

Jul 14, 2023
Davide Giacomini, Maeesha Binte Hashem, Jeremiah Suarez, Swarup Bhunia, Amit Ranjan Trivedi

Via arXiv

RobustDistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness

Feb 18, 2023
Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk

Via arXiv

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Dec 15, 2022
Hirofumi Inaguma, Sravya Popuri, Ilia Kulikov, Peng-Jen Chen, Changhan Wang, Yu-An Chung, Yun Tang, Ann Lee, Shinji Watanabe, Juan Pino

Via arXiv

Dialog act guided contextual adapter for personalized speech recognition

Mar 31, 2023
Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant P. Strimel, Ross McGowan

Via arXiv

Audio-Driven Co-Speech Gesture Video Generation

Dec 05, 2022
Xian Liu, Qianyi Wu, Hang Zhou, Yuanqi Du, Wayne Wu, Dahua Lin, Ziwei Liu

Via arXiv

BIG-C: a Multimodal Multi-Purpose Dataset for Bemba

May 26, 2023
Claytone Sikasote, Eunice Mukonde, Md Mahfuz Ibn Alam, Antonios Anastasopoulos

Via arXiv

Robustness of Multi-Source MT to Transcription Errors

May 26, 2023
Dominik Macháček, Peter Polák, Ondřej Bojar, Raj Dabre

Via arXiv

LyricWhiz: Robust Multilingual Zero-shot Lyrics Transcription by Whispering to ChatGPT

Jul 07, 2023
Le Zhuo, Ruibin Yuan, Jiahao Pan, Yinghao Ma, Yizhi Li, Ge Zhang, Si Liu, Roger Dannenberg, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenhu Chen, Wei Xue, Yike Guo

Via arXiv

Multilingual Contextual Adapters To Improve Custom Word Recognition In Low-resource Languages

Jul 03, 2023
Devang Kulshreshtha, Saket Dingliwal, Brady Houston, Sravan Bodapati

Via arXiv

Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition

Apr 03, 2023
Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree M. Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann

Via arXiv