
"speech recognition": models, code, and papers

Semi-Supervised Speech Recognition via Graph-based Temporal Classification

Oct 29, 2020
Niko Moritz, Takaaki Hori, Jonathan Le Roux


Speech Recognition by Machine, A Review

Jan 13, 2010
M. A. Anusuya, S. K. Katti


End-to-End Visual Speech Recognition for Small-Scale Datasets

Apr 02, 2019
Stavros Petridis, Yujiang Wang, Pingchuan Ma, Zuwei Li, Maja Pantic


Uconv-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition

Aug 16, 2022
Andrei Andrusenko, Rauf Nasretdinov, Aleksei Romanenko


Multi-channel Opus compression for far-field automatic speech recognition with a fixed bitrate budget

Jun 15, 2021
Lukas Drude, Jahn Heymann, Andreas Schwarz, Jean-Marc Valin


A Universally-Deployable ASR Frontend for Joint Acoustic Echo Cancellation, Speech Enhancement, and Voice Separation

Sep 14, 2022
Tom O'Malley, Arun Narayanan, Quan Wang


Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition

Jun 08, 2021
Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu


Phoneme-Based Persian Speech Recognition

Jan 15, 2019
Saber Malekzadeh


Continuous Pseudo-Labeling from the Start

Oct 17, 2022
Dan Berrebbi, Ronan Collobert, Samy Bengio, Navdeep Jaitly, Tatiana Likhomanenko


FLEURS: Few-shot Learning Evaluation of Universal Representations of Speech

May 25, 2022
Alexis Conneau, Min Ma, Simran Khanuja, Yu Zhang, Vera Axelrod, Siddharth Dalmia, Jason Riesa, Clara Rivera, Ankur Bapna
