"speech recognition": models, code, and papers

Large scale weakly and semi-supervised learning for low-resource video ASR

May 16, 2020
Kritika Singh, Vimal Manohar, Alex Xiao, Sergey Edunov, Ross Girshick, Vitaliy Liptchinsky, Christian Fuegen, Yatharth Saraf, Geoffrey Zweig, Abdelrahman Mohamed

Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks

Dec 20, 2021
Fei Sun, Minghai Qin, Tianyun Zhang, Xiaolong Ma, Haoran Li, Junwen Luo, Zihao Zhao, Yen-Kuang Chen, Yuan Xie

Speaker Diarization with Lexical Information

Apr 13, 2020
Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bowen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

Fast-MD: Fast Multi-Decoder End-to-End Speech Translation with Non-Autoregressive Hidden Intermediates

Sep 27, 2021
Hirofumi Inaguma, Siddharth Dalmia, Brian Yan, Shinji Watanabe

NeMo Inverse Text Normalization: From Development To Production

Apr 11, 2021
Yang Zhang, Evelina Bakhturina, Kyle Gorman, Boris Ginsburg

End-to-end Named Entity Recognition from English Speech

May 22, 2020
Hemant Yadav, Sreyan Ghosh, Yi Yu, Rajiv Ratn Shah

Sequence-to-Sequence Modeling for Action Identification at High Temporal Resolution

Nov 03, 2021
Aakash Kaku, Kangning Liu, Avinash Parnandi, Haresh Rengaraj Rajamohan, Kannan Venkataramanan, Anita Venkatesan, Audre Wirtanen, Natasha Pandit, Heidi Schambra, Carlos Fernandez-Granda

Topic Classification on Spoken Documents Using Deep Acoustic and Linguistic Features

Jun 16, 2021
Tan Liu, Wu Guo, Bin Gu

Towards Transferable Speech Emotion Representation: On loss functions for cross-lingual latent representations

Mar 28, 2022
Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen

Noisy Training Improves E2E ASR for the Edge

Jul 09, 2021
Dilin Wang, Yuan Shangguan, Haichuan Yang, Pierce Chuang, Jiatong Zhou, Meng Li, Ganesh Venkatesh, Ozlem Kalinli, Vikas Chandra
