"speech recognition": models, code, and papers

A Comparative Study on Non-Autoregressive Modelings for Speech-to-Text Generation

Oct 11, 2021
Yosuke Higuchi, Nanxin Chen, Yuya Fujita, Hirofumi Inaguma, Tatsuya Komatsu, Jaesong Lee, Jumon Nozaki, Tianzi Wang, Shinji Watanabe

Via arXiv

Self-supervised reinforcement learning for speaker localisation with the iCub humanoid robot

Nov 12, 2020
Jonas Gonzalez-Billandon, Lukas Grasse, Matthew Tata, Alessandra Sciutti, Francesco Rea

Via arXiv

Lightweight dynamic filter for keyword spotting

Sep 27, 2021
Donghyeon Kim, Kyungdeuk Ko, Jeong-gi Kwak, David K. Han, Hanseok Ko

Via arXiv

Long Expressive Memory for Sequence Modeling

Oct 10, 2021
T. Konstantin Rusch, Siddhartha Mishra, N. Benjamin Erichson, Michael W. Mahoney

Via arXiv

Prompt-tuning in ASR systems for efficient domain-adaptation

Oct 22, 2021
Saket Dingliwal, Ashish Shenoy, Sravan Bodapati, Ankur Gandhe, Ravi Teja Gadde, Katrin Kirchhoff

Via arXiv

SEMOUR: A Scripted Emotional Speech Repository for Urdu

May 19, 2021
Nimra Zaheer, Obaid Ullah Ahmad, Ammar Ahmed, Muhammad Shehryar Khan, Mudassir Shabbir

Via arXiv

Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone

Apr 12, 2021
Naoyuki Kanda, Guoli Ye, Yu Wu, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka

Via arXiv

One model to enhance them all: array geometry agnostic multi-channel personalized speech enhancement

Oct 20, 2021
Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang

Via arXiv

Automatic Learning of Subword Dependent Model Scales

Oct 18, 2021
Felix Meyer, Wilfried Michel, Mohammad Zeineldeen, Ralf Schlüter, Hermann Ney

Via arXiv