"speech recognition": models, code, and papers

BEA-Base: A Benchmark for ASR of Spontaneous Hungarian

Feb 01, 2022
P. Mihajlik, A. Balog, T. E. Gráczi, A. Kohári, B. Tarján, K. Mády

Iterative pseudo-forced alignment by acoustic CTC loss for self-supervised ASR domain adaptation

Oct 27, 2022
Fernando López, Jordi Luque

Adaptive Natural Language Generation for Task-oriented Dialogue via Reinforcement Learning

Sep 16, 2022
Atsumoto Ohashi, Ryuichiro Higashinaka

Q-ASR: Integer-only Zero-shot Quantization for Efficient Speech Recognition

Mar 31, 2021
Sehoon Kim, Amir Gholami, Zhewei Yao, Anirudda Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer

Automatic Documentation of ICD Codes with Far-Field Speech Recognition

Nov 04, 2018
Albert Haque, Corinna Fukushima

Learning not to Discriminate: Task Agnostic Learning for Improving Monolingual and Code-switched Speech Recognition

Jun 09, 2020
Gurunath Reddy Madhumani, Sanket Shah, Basil Abraham, Vikas Joshi, Sunayana Sitaram

A comparable study of modeling units for end-to-end Mandarin speech recognition

May 14, 2018
Wei Zou, Dongwei Jiang, Shuaijiang Zhao, Xiangang Li

Deep LSTM for Large Vocabulary Continuous Speech Recognition

Mar 21, 2017
Xu Tian, Jun Zhang, Zejun Ma, Yi He, Juan Wei, Peihao Wu, Wenchang Situ, Shuai Li, Yang Zhang

Deep word embeddings for visual speech recognition

Oct 30, 2017
Themos Stafylakis, Georgios Tzimiropoulos

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

Jun 02, 2022
Sehoon Kim, Amir Gholami, Albert Shaw, Nicholas Lee, Karttikeya Mangalam, Jitendra Malik, Michael W. Mahoney, Kurt Keutzer
