Jinyu Li

Fred

Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

Apr 27, 2022

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Apr 11, 2022

Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data

Mar 31, 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Mar 30, 2022

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Mar 02, 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Feb 05, 2022

Endpoint Detection for Streaming End-to-End Multi-talker ASR

Jan 24, 2022

Self-Supervised Learning for speech recognition with Intermediate layer supervision

Dec 16, 2021

Sequence-level self-learning with multiple hypotheses

Dec 10, 2021

Separating Long-Form Speech with Group-Wise Permutation Invariant Training

Nov 17, 2021