"speech recognition": models, code, and papers

ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding

Aug 30, 2021
Lingyun Feng, Jianwei Yu, Deng Cai, Songxiang Liu, Haitao Zheng, Yan Wang

Animal inspired Application of a Variant of Mel Spectrogram for Seismic Data Processing

Sep 22, 2021
Samayan Bhattacharya, Sk Shahnawaz

Recurrent Models for Auditory Attention in Multi-Microphone Distance Speech Recognition

Jan 07, 2016
Suyoun Kim, Ian Lane

A Multimodal Framework for Video Ads Understanding

Aug 29, 2021
Zejia Weng, Lingchen Meng, Rui Wang, Zuxuan Wu, Yu-Gang Jiang

A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese

May 18, 2018
Shiyu Zhou, Linhao Dong, Shuang Xu, Bo Xu

Dynamic Layer Customization for Noise Robust Speech Emotion Recognition in Heterogeneous Condition Training

Oct 21, 2020
Alex Wilf, Emily Mower Provost

BSTC: A Large-Scale Chinese-English Speech Translation Dataset

Apr 19, 2021
Ruiqing Zhang, Xiyang Wang, Chuanqiang Zhang, Zhongjun He, Hua Wu, Zhi Li, Haifeng Wang, Ying Chen, Qinfei Li

Cross-Attention End-to-End ASR for Two-Party Conversations

Jul 24, 2019
Suyoun Kim, Siddharth Dalmia, Florian Metze

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy

Oct 11, 2021
Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori

Knowing What to Listen to: Early Attention for Deep Speech Representation Learning

Sep 03, 2020
Amirhossein Hajavi, Ali Etemad
