"speech recognition": models, code, and papers

Fusing Wav2vec2.0 and BERT into End-to-end Model for Low-resource Speech Recognition

Jan 17, 2021
Cheng Yi, Shiyu Zhou, Bo Xu

Exploring CTC Based End-to-End Techniques for Myanmar Speech Recognition

May 14, 2021
Khin Me Me Chit, Laet Laet Lin

Improving Speech Emotion Recognition Through Focus and Calibration Attention Mechanisms

Aug 21, 2022
Junghun Kim, Yoojin An, Jihie Kim

SoftCTC – Semi-Supervised Learning for Text Recognition using Soft Pseudo-Labels

Dec 05, 2022
Martin Kišš, Michal Hradiš, Karel Beneš, Petr Buchal, Michal Kula

Dual-Decoder Transformer For end-to-end Mandarin Chinese Speech Recognition with Pinyin and Character

Jan 26, 2022
Zhao Yang, Wei Xi, Rui Wang, Rui Jiang, Jizhong Zhao

wav2letter++: The Fastest Open-source Speech Recognition System

Dec 18, 2018
Vineel Pratap, Awni Hannun, Qiantong Xu, Jeff Cai, Jacob Kahn, Gabriel Synnaeve, Vitaliy Liptchinsky, Ronan Collobert

Tourist Guidance Robot Based on HyperCLOVA

Oct 19, 2022
Takato Yamazaki, Katsumasa Yoshikawa, Toshiki Kawamoto, Masaya Ohagi, Tomoya Mizumoto, Shuta Ichimura, Yusuke Kida, Toshinori Sato

Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language

Dec 14, 2022
Alexei Baevski, Arun Babu, Wei-Ning Hsu, Michael Auli

A Noise-Robust Self-supervised Pre-training Model Based Speech Representation Learning for Automatic Speech Recognition

Jan 22, 2022
Qiu-Shi Zhu, Jie Zhang, Zi-Qiang Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai
