Jinyu Li

Recent Advances in End-to-End Automatic Speech Recognition

Nov 02, 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

Oct 29, 2021

Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction

Oct 28, 2021

Continuous Speech Separation with Recurrent Selective Attention Network

Oct 28, 2021

Factorized Neural Transducer for Efficient Language Model Adaptation

Oct 18, 2021

Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition

Oct 14, 2021

SpeechT5: Unified-Modal Encoder-Decoder Pre-training for Spoken Language Processing

Oct 14, 2021

UniSpeech-SAT: Universal Speech Representation Learning with Speaker Aware Pre-Training

Oct 12, 2021

Wav2vec-Switch: Contrastive Learning from Original-noisy Speech Pairs for Robust Speech Recognition

Oct 11, 2021

Have best of both worlds: two-pass hybrid and E2E cascading framework for speech recognition

Oct 10, 2021