"speech recognition": models, code, and papers

Improving CTC-based ASR Models with Gated Interlayer Collaboration

May 25, 2022
Yuting Yang, Yuke Li, Binbin Du

User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis

Dec 15, 2020
Oliver Adams, Benjamin Galliot, Guillaume Wisniewski, Nicholas Lambourne, Ben Foley, Rahasya Sanders-Dwyer, Janet Wiles, Alexis Michaud, Séverine Guillaume, Laurent Besacier, Christopher Cox, Katya Aplonova, Guillaume Jacques, Nathan Hill

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes

Apr 13, 2022
Shaojin Ding, Weiran Wang, Ding Zhao, Tara N. Sainath, Yanzhang He, Robert David, Rami Botros, Xin Wang, Rina Panigrahy, Qiao Liang, Dongseong Hwang, Ian McGraw, Rohit Prabhavalkar, Trevor Strohman

DPSNN: A Differentially Private Spiking Neural Network

May 24, 2022
Jihang Wang, Dongcheng Zhao, Guobin Shen, Qian Zhang, Yi Zeng

Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset

Mar 31, 2022
Zehui Yang, Yifan Chen, Lei Luo, Runyan Yang, Lingxuan Ye, Gaofeng Cheng, Ji Xu, Yaohui Jin, Qingqing Zhang, Pengyuan Zhang, Lei Xie, Yonghong Yan

Deep Learning for Distant Speech Recognition

Dec 17, 2017
Mirco Ravanelli

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Apr 11, 2022
Jian Xue, Peidong Wang, Jinyu Li, Matt Post, Yashesh Gaur

Invariant Representations for Noisy Speech Recognition

Nov 27, 2016
Dmitriy Serdyuk, Kartik Audhkhasi, Philémon Brakel, Bhuvana Ramabhadran, Samuel Thomas, Yoshua Bengio

Integrating Source-channel and Attention-based Sequence-to-sequence Models for Speech Recognition

Oct 01, 2019
Qiujia Li, Chao Zhang, Philip C. Woodland

Speech Emotion Recognition System by Quaternion Nonlinear Echo State Network

Nov 14, 2021
Fatemeh Daneshfar, Seyed Jahanshah Kabudian
