"speech recognition": models, code, and papers

Localization Based Sequential Grouping for Continuous Speech Separation

Jul 14, 2021
Zhong-Qiu Wang, DeLiang Wang

Via arXiv

On Prosody Modeling for ASR+TTS based Voice Conversion

Jul 20, 2021
Wen-Chin Huang, Tomoki Hayashi, Xinjian Li, Shinji Watanabe, Tomoki Toda

Via arXiv

DEXTER: Deep Encoding of External Knowledge for Named Entity Recognition in Virtual Assistants

Aug 15, 2021
Deepak Muralidharan, Joel Ruben Antony Moniz, Weicheng Zhang, Stephen Pulman, Lin Li, Megan Barnes, Jingjing Pan, Jason Williams, Alex Acero

Via arXiv

Detecting Audio Adversarial Examples with Logit Noising

Dec 13, 2021
Namgyu Park, Sangwoo Ji, Jong Kim

Via arXiv

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

Jun 14, 2021
Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

Via arXiv

Towards Resistant Audio Adversarial Examples

Oct 14, 2020
Tom Dörr, Karla Markert, Nicolas M. Müller, Konstantin Böttinger

Via arXiv

A Comprehensive Study of Deep Bidirectional LSTM RNNs for Acoustic Modeling in Speech Recognition

Mar 29, 2017
Albert Zeyer, Patrick Doetsch, Paul Voigtlaender, Ralf Schlüter, Hermann Ney

Via arXiv

End to End ASR System with Automatic Punctuation Insertion

Dec 03, 2020
Yushi Guan

Via arXiv

Cross-domain Single-channel Speech Enhancement Model with Bi-projection Fusion Module for Noise-robust ASR

Aug 26, 2021
Fu-An Chao, Jeih-weih Hung, Berlin Chen

Via arXiv

MT3: Multi-Task Multitrack Music Transcription

Nov 04, 2021
Josh Gardner, Ian Simon, Ethan Manilow, Curtis Hawthorne, Jesse Engel

Via arXiv