"speech recognition": models, code, and papers

Personalized Speech Enhancement: New Models and Comprehensive Evaluation

Oct 18, 2021
Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang

Towards Online End-to-end Transformer Automatic Speech Recognition

Oct 25, 2019
Emiru Tsunoo, Yosuke Kashiwagi, Toshiyuki Kumakura, Shinji Watanabe

Streaming non-autoregressive model for any-to-many voice conversion

Jun 15, 2022
Ziyi Chen, Haoran Miao, Pengyuan Zhang

Pruned RNN-T for fast, memory-efficient ASR training

Jun 23, 2022
Fangjun Kuang, Liyong Guo, Wei Kang, Long Lin, Mingshuang Luo, Zengwei Yao, Daniel Povey

End-to-End Automatic Speech Recognition Integrated With CTC-Based Voice Activity Detection

Feb 03, 2020
Takenori Yoshimura, Tomoki Hayashi, Kazuya Takeda, Shinji Watanabe

Transformer-based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project

Jun 15, 2022
Jan Lehečka, Josef V. Psutka, Josef Psutka

Multi-view Frequency LSTM: An Efficient Frontend for Automatic Speech Recognition

Jun 30, 2020
Maarten Van Segbroeck, Harish Mallidih, Brian King, I-Fan Chen, Gurpreet Chadha, Roland Maas

Kite: Automatic speech recognition for unmanned aerial vehicles

Jul 02, 2019
Dan Oneata, Horia Cucu

Encrypted Speech Recognition using Deep Polynomial Networks

May 11, 2019
Shi-Xiong Zhang, Yifan Gong, Dong Yu

ClovaCall: Korean Goal-Oriented Dialog Speech Corpus for Automatic Speech Recognition of Contact Centers

Apr 20, 2020
Jung-Woo Ha, Kihyun Nam, Jin Gu Kang, Sang-Woo Lee, Sohee Yang, Hyunhoon Jung, Eunmi Kim, Hyeji Kim, Soojin Kim, Hyun Ah Kim, Kyoungtae Doh, Chan Kyu Lee, Sunghun Kim
