"speech recognition": models, code, and papers

Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

Sep 14, 2021
Katrin Tomanek, Vicky Zayats, Dirk Padfield, Kara Vaillancourt, Fadi Biadsy


Continuous Speech Separation with Recurrent Selective Attention Network

Oct 28, 2021
Yixuan Zhang, Zhuo Chen, Jian Wu, Takuya Yoshioka, Peidong Wang, Zhong Meng, Jinyu Li


Wav2Vec2.0 on the Edge: Performance Evaluation

Feb 12, 2022
Santosh Gondi


SpliceOut: A Simple and Efficient Audio Augmentation Method

Oct 13, 2021
Arjit Jain, Pranay Reddy Samala, Deepak Mittal, Preethi Jyothi, Maneesh Singh


LightSeq2: Accelerated Training for Transformer-based Models on GPUs

Oct 27, 2021
Xiaohui Wang, Ying Xiong, Xian Qian, Yang Wei, Lei Li, Mingxuan Wang


Neural Dependency Coding inspired Multimodal Fusion

Sep 28, 2021
Shiv Shankar


Spatial Diffuseness Features for DNN-Based Speech Recognition in Noisy and Reverberant Environments

Feb 16, 2015
Andreas Schwarz, Christian Huemmer, Roland Maas, Walter Kellermann


Subject Envelope based Multitype Reconstruction Algorithm of Speech Samples of Parkinson's Disease

Aug 23, 2021
Yongming Li, Chengyu Liu, Pin Wang, Hehua Zhang, Anhai Wei


Lhotse: a speech data representation library for the modern deep learning ecosystem

Oct 25, 2021
Piotr Żelasko, Daniel Povey, Jan "Yenda" Trmal, Sanjeev Khudanpur


Deep Speech: Scaling up end-to-end speech recognition

Dec 19, 2014
Awni Hannun, Carl Case, Jared Casper, Bryan Catanzaro, Greg Diamos, Erich Elsen, Ryan Prenger, Sanjeev Satheesh, Shubho Sengupta, Adam Coates, Andrew Y. Ng
