"speech recognition": models, code, and papers

Conversational Speech Recognition by Learning Audio-textual Cross-modal Contextual Representation

Oct 22, 2023
Kun Wei, Bei Li, Hang Lv, Quan Lu, Ning Jiang, Lei Xie

Via arXiv

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

Oct 27, 2023
Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis

Via arXiv

Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

Nov 17, 2023
Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie

Via arXiv

TemporalAugmenter: An Ensemble Recurrent Based Deep Learning Approach for Signal Classification

Jan 13, 2024
Nelly Elsayed, Constantinos L. Zekios, Navid Asadizanjani, Zag ElSayed

Via arXiv

Folding Attention: Memory and Power Optimization for On-Device Transformer-based Streaming Speech Recognition

Sep 14, 2023
Yang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Ernie Chang, Yangyang Shi, Vikas Chandra

Via arXiv

Generative Context-aware Fine-tuning of Self-supervised Speech Models

Dec 15, 2023
Suwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe, Karen Livescu

Via arXiv

Neural Language Model Pruning for Automatic Speech Recognition

Oct 05, 2023
Leonardo Emili, Thiago Fraga-Silva, Ernest Pusateri, Markus Nußbaum-Thom, Youssef Oualil

Via arXiv

Towards Domain-Specific Cross-Corpus Speech Emotion Recognition Approach

Dec 11, 2023
Yan Zhao, Yuan Zong, Hailun Lian, Cheng Lu, Jingang Shi, Wenming Zheng

Via arXiv

Zipformer: A faster and better encoder for automatic speech recognition

Oct 17, 2023
Zengwei Yao, Liyong Guo, Xiaoyu Yang, Wei Kang, Fangjun Kuang, Yifan Yang, Zengrui Jin, Long Lin, Daniel Povey

Via arXiv

SALM: Speech-augmented Language Model with In-context Learning for Speech Recognition and Translation

Oct 13, 2023
Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg

Via arXiv