"speech": models, code, and papers

A New 27 Class Sign Language Dataset Collected from 173 Individuals

Mar 08, 2022
Arda Mavi, Zeynep Dikle

Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition

Jun 08, 2021
Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu

Figure 1 for Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Figure 2 for Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Figure 3 for Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Figure 4 for Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition
Viaarxiv icon

Quantifying and Maximizing the Benefits of Back-End Noise Adaption on Attention-Based Speech Recognition Models

May 03, 2021
Coleman Hooper, Thierry Tambe, Gu-Yeon Wei

Transformer with Bidirectional Decoder for Speech Recognition

Aug 11, 2020
Xi Chen, Songyang Zhang, Dandan Song, Peng Ouyang, Shouyi Yin

CarneliNet: Neural Mixture Model for Automatic Speech Recognition

Jul 22, 2021
Aleksei Kalinov, Somshubra Majumdar, Jagadeesh Balam, Boris Ginsburg

Deep MOS Predictor for Synthetic Speech Using Cluster-Based Modeling

Aug 09, 2020
Yeunju Choi, Youngmoon Jung, Hoirin Kim

Fine-Grained Grounding for Multimodal Speech Recognition

Oct 05, 2020
Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott

Disentangled Speaker Representation Learning via Mutual Information Minimization

Aug 17, 2022
Sung Hwan Mun, Min Hyun Han, Minchan Kim, Dongjune Lee, Nam Soo Kim

Improving Mandarin End-to-End Speech Recognition with Word N-gram Language Model

Jan 06, 2022
Jinchuan Tian, Jianwei Yu, Chao Weng, Yuexian Zou, Dong Yu

Multi-Channel Speech Enhancement using Graph Neural Networks

Feb 13, 2021
Panagiotis Tzirakis, Anurag Kumar, Jacob Donley