"speech recognition": models, code, and papers

Towards Identity Preserving Normal to Dysarthric Voice Conversion

Oct 15, 2021
Wen-Chin Huang, Bence Mark Halpern, Lester Phillip Violeta, Odette Scharenborg, Tomoki Toda

Via arXiv

Improving Punctuation Restoration for Speech Transcripts via External Data

Oct 01, 2021
Xue-Yong Fu, Cheng Chen, Md Tahmid Rahman Laskar, Shashi Bhushan TN, Simon Corston-Oliver

Via arXiv

Continuous Speech Separation with Recurrent Selective Attention Network

Oct 28, 2021
Yixuan Zhang, Zhuo Chen, Jian Wu, Takuya Yoshioka, Peidong Wang, Zhong Meng, Jinyu Li

Via arXiv

LightSeq2: Accelerated Training for Transformer-based Models on GPUs

Oct 27, 2021
Xiaohui Wang, Ying Xiong, Xian Qian, Yang Wei, Lei Li, Mingxuan Wang

Via arXiv

Residual Adapters for Parameter-Efficient ASR Adaptation to Atypical and Accented Speech

Sep 14, 2021
Katrin Tomanek, Vicky Zayats, Dirk Padfield, Kara Vaillancourt, Fadi Biadsy

Via arXiv

SpliceOut: A Simple and Efficient Audio Augmentation Method

Oct 13, 2021
Arjit Jain, Pranay Reddy Samala, Deepak Mittal, Preethi Jyoti, Maneesh Singh

Via arXiv

Learning Transferable Features for Speech Emotion Recognition

Dec 23, 2019
Alison Marczewski, Adriano Veloso, Nívio Ziviani

Via arXiv

Neural Dependency Coding inspired Multimodal Fusion

Sep 28, 2021
Shiv Shankar

Via arXiv

A.I. based Embedded Speech to Text Using Deepspeech

Feb 25, 2020
Muhammad Hafidh Firmansyah, Anand Paul, Deblina Bhattacharya, Gul Malik Urfa

Via arXiv

Lhotse: a speech data representation library for the modern deep learning ecosystem

Oct 25, 2021
Piotr Żelasko, Daniel Povey, Jan "Yenda" Trmal, Sanjeev Khudanpur

Via arXiv