
Chunyang Wu


Prompting Large Language Models with Speech Recognition Abilities

Jul 21, 2023

Towards Selection of Text-to-speech Data to Augment ASR Training

May 30, 2023

Multi-Head State Space Model for Speech Recognition

May 25, 2023

Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution

Oct 07, 2021

Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios

Apr 06, 2021

Dissecting User-Perceived Latency of On-Device E2E Speech Recognition

Apr 06, 2021

Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency

Apr 05, 2021

Streaming Attention-Based Models with Augmented Memory for End-to-End Speech Recognition

Nov 03, 2020

Transformer in action: a comparative study of transformer-based acoustic models for large scale speech recognition applications

Oct 29, 2020

Emformer: Efficient Memory Transformer Based Acoustic Model For Low Latency Streaming Speech Recognition

Oct 29, 2020