Jasha Droppo

Do You Listen with One or Two Microphones? A Unified ASR Model for Single and Multi-Channel Audio

Jun 28, 2021
Gokce Keskin, Minhua Wu, Brian King, Harish Mallidi, Yang Gao, Jasha Droppo, Ariya Rastrow, Roland Maas

SynthASR: Unlocking Synthetic Data for Speech Recognition

Jun 14, 2021
Amin Fazel, Wei Yang, Yulan Liu, Roberto Barra-Chicote, Yixiong Meng, Roland Maas, Jasha Droppo

CoDERT: Distilling Encoder Representations with Co-learning for Transducer-based Speech Recognition

Jun 14, 2021
Rupak Vignesh Swaminathan, Brian King, Grant P. Strimel, Jasha Droppo, Athanasios Mouchtaris

Scaling Laws for Acoustic Models

Jun 11, 2021
Jasha Droppo, Oguz Elibol

Improving multi-speaker TTS prosody variance with a residual encoder and normalizing flows

Jun 10, 2021
Iván Vallés-Pérez, Julian Roth, Grzegorz Beringer, Roberto Barra-Chicote, Jasha Droppo

Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition

May 14, 2021
Bhargav Pulugundla, Yang Gao, Brian King, Gokce Keskin, Harish Mallidi, Minhua Wu, Jasha Droppo, Roland Maas

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End

May 14, 2021
Swayambhu Nath Ray, Minhua Wu, Anirudh Raju, Pegah Ghahremani, Raghavendra Bilgi, Milind Rao, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Jasha Droppo

Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Mar 09, 2021
Samik Sadhu, Di He, Che-Wei Huang, Sri Harish Mallidi, Minhua Wu, Ariya Rastrow, Andreas Stolcke, Jasha Droppo, Roland Maas

Do as I mean, not as I say: Sequence Loss Training for Spoken Language Understanding

Feb 12, 2021
Milind Rao, Pranav Dheram, Gautam Tiwari, Anirudh Raju, Jasha Droppo, Ariya Rastrow, Andreas Stolcke
