"speech recognition": models, code, and papers

ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding

Aug 30, 2021
Lingyun Feng, Jianwei Yu, Deng Cai, Songxiang Liu, Haitao Zheng, Yan Wang

A Multimodal Framework for Video Ads Understanding

Aug 29, 2021
Zejia Weng, Lingchen Meng, Rui Wang, Zuxuan Wu, Yu-Gang Jiang

A three-dimensional approach to Visual Speech Recognition using Discrete Cosine Transforms

Sep 07, 2016
Toni Heidenreich, Michael W. Spratling

A Comparison of Methods for OOV-word Recognition on a New Public Dataset

Jul 16, 2021
Rudolf A. Braun, Srikanth Madikeri, Petr Motlicek

Animal inspired Application of a Variant of Mel Spectrogram for Seismic Data Processing

Sep 22, 2021
Samayan Bhattacharya, Sk Shahnawaz

Stepwise-Refining Speech Separation Network via Fine-Grained Encoding in High-order Latent Domain

Oct 10, 2021
Zengwei Yao, Wenjie Pei, Fanglin Chen, Guangming Lu, David Zhang

Kaizen: Continuously improving teacher using Exponential Moving Average for semi-supervised speech recognition

Jun 14, 2021
Vimal Manohar, Tatiana Likhomanenko, Qiantong Xu, Wei-Ning Hsu, Ronan Collobert, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

Combined Acoustic and Pronunciation Modelling for Non-Native Speech Recognition

Nov 06, 2007
Ghazi Bouselmi, Dominique Fohr, Irina Illina

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 09, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

End-to-End Automatic Speech Recognition with Deep Mutual Learning

Feb 16, 2021
Ryo Masumura, Mana Ihori, Akihiko Takashima, Tomohiro Tanaka, Takanori Ashihara
