"speech": models, code, and papers

Universal Adversarial Perturbations for Speech Recognition Systems

May 09, 2019
Paarth Neekhara, Shehzeen Hussain, Prakhar Pandey, Shlomo Dubnov, Julian McAuley, Farinaz Koushanfar

Phone Duration Modeling for Speaker Age Estimation in Children

Sep 03, 2021
Prashanth Gurunath Shivakumar, Somer Bishop, Catherine Lord, Shrikanth Narayanan

Breaking the Data Barrier: Towards Robust Speech Translation via Adversarial Stability Training

Oct 08, 2019
Qiao Cheng, Meiyuan Fang, Yaqian Han, Jin Huang, Yitao Duan

ATCSpeech: a multilingual pilot-controller speech corpus from real Air Traffic Control environment

Nov 26, 2019
Bo Yang, Xianlong Tan, Zhengmao Chen, Bing Wang, Dan Li, Zhongping Yang, Xiping Wu, Yi Lin

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners

May 29, 2022
Zhenhailong Wang, Manling Li, Ruochen Xu, Luowei Zhou, Jie Lei, Xudong Lin, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Derek Hoiem, Shih-Fu Chang, Mohit Bansal, Heng Ji

Survey on Deep Neural Networks in Speech and Vision Systems

Aug 16, 2019
Mahbubul Alam, Manar D. Samad, Lasitha Vidyaratne, Alexander Glandon, Khan M. Iftekharuddin

Multi-head Monotonic Chunkwise Attention For Online Speech Recognition

May 01, 2020
Baiji Liu, Songjun Cao, Sining Sun, Weibin Zhang, Long Ma

Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems

Dec 10, 2021
Manaal Faruqui, Dilek Hakkani-Tür

A survey on recently proposed activation functions for Deep Learning

Apr 07, 2022
Murilo Gustineli

Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture

Jan 15, 2020
Haoran Miao, Gaofeng Cheng, Changfeng Gao, Pengyuan Zhang, Yonghong Yan
