Rohit Prabhavalkar

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

Apr 06, 2021
Via arXiv

Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency

Apr 05, 2021
Via arXiv

Learning Word-Level Confidence For Subword End-to-End ASR

Mar 11, 2021
Via arXiv

Less Is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging

Dec 12, 2020
Via arXiv

Cascaded encoders for unifying streaming and non-streaming ASR

Oct 27, 2020
Via arXiv

Replacing Human Audio with Synthetic Audio for On-device Unspoken Punctuation Prediction

Oct 20, 2020
Via arXiv

RNN-T Models Fail to Generalize to Out-of-Domain Audio: Causes and Solutions

May 17, 2020
Via arXiv

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

Mar 28, 2020
Via arXiv

Deliberation Model Based Two-Pass End-to-End Speech Recognition

Mar 17, 2020
Via arXiv

A comparison of end-to-end models for long-form speech recognition

Nov 06, 2019
Via arXiv