
"speech recognition": models, code, and papers

M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

Oct 14, 2021
Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu

High-Accuracy and Low-Latency Speech Recognition with Two-Head Contextual Layer Trajectory LSTM Model

Mar 17, 2020
Jinyu Li, Rui Zhao, Eric Sun, Jeremy H. M. Wong, Amit Das, Zhong Meng, Yifan Gong

Towards speech-to-text translation without speech recognition

Feb 13, 2017
Sameer Bansal, Herman Kamper, Adam Lopez, Sharon Goldwater

Representation learning through cross-modal conditional teacher-student training for speech emotion recognition

Nov 30, 2021
Sundararajan Srinivasan, Zhaocheng Huang, Katrin Kirchhoff

Integrating Source-channel and Attention-based Sequence-to-sequence Models for Speech Recognition

Sep 14, 2019
Qiujia Li, Chao Zhang, Philip C. Woodland

Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

Feb 08, 2022
Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu

To Reverse the Gradient or Not: An Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition

Dec 13, 2018
Yossi Adi, Neil Zeghidour, Ronan Collobert, Nicolas Usunier, Vitaliy Liptchinsky, Gabriel Synnaeve

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

Jun 13, 2021
Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

Hybrid CTC-Attention based End-to-End Speech Recognition using Subword Units

Sep 06, 2018
Zhangyu Xiao, Zhijian Ou, Wei Chu, Hui Lin

Punctuation Restoration

Feb 19, 2022
Viet Dac Lai, Amir Pouran Ben Veyseh, Franck Dernoncourt, Thien Huu Nguyen
