Jasha Droppo

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

Jun 14, 2021

Scaling Laws for Acoustic Models

Jun 11, 2021

Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows

Jun 10, 2021

Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition

May 14, 2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

May 14, 2021

Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Mar 09, 2021

Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding

Feb 12, 2021

Detection of Lexical Stress Errors in Non-native English with Data Augmentation and Attention

Dec 29, 2020

Efficient minimum word error rate training of RNN-Transducer for end-to-end speech recognition

Jul 27, 2020

Acoustic-To-Word Model Without OOV

Nov 28, 2017