"speech recognition": models, code, and papers

Cycle-consistency training for end-to-end speech recognition

Nov 02, 2018
Takaaki Hori, Ramon Astudillo, Tomoki Hayashi, Yu Zhang, Shinji Watanabe, Jonathan Le Roux

Minimising Biasing Word Errors for Contextual ASR with the Tree-Constrained Pointer Generator

May 18, 2022
Guangzhi Sun, Chao Zhang, Philip C Woodland

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

Apr 07, 2022
Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model

Apr 07, 2022
Nick J. C. Wang, Lu Wang, Yandan Sun, Haimei Kang, Dejun Zhang

Spoken SQuAD: A Study of Mitigating the Impact of Speech Recognition Errors on Listening Comprehension

Apr 01, 2018
Chia-Hsuan Li, Szu-Lin Wu, Chi-Liang Liu, Hung-yi Lee

Algorithms for Speech Recognition and Language Processing

Sep 17, 1996
Mehryar Mohri, Michael Riley, Richard Sproat

Star Temporal Classification: Sequence Classification with Partially Labeled Data

Jan 28, 2022
Vineel Pratap, Awni Hannun, Gabriel Synnaeve, Ronan Collobert

Improving the Robustness of DistilHuBERT to Unseen Noisy Conditions via Data Augmentation, Curriculum Learning, and Multi-Task Enhancement

Nov 12, 2022
Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Tiago H. Falk

A Fast-Converged Acoustic Modeling for Korean Speech Recognition: A Preliminary Study on Time Delay Neural Network

Jul 11, 2018
Hosung Park, Donghyun Lee, Minkyu Lim, Yoseb Kang, Juneseok Oh, Ji-Hwan Kim

Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition

Mar 09, 2020
Yuanhang Zhang, Shuang Yang, Jingyun Xiao, Shiguang Shan, Xilin Chen
