Jinyu Li

Beijing Institute of Technology, China

Supervision-Guided Codebooks for Masked Prediction in Speech Pre-training

Jun 21, 2022
Via arXiv

The YiTrans End-to-End Speech Translation System for IWSLT 2022 Offline Shared Task

Jun 14, 2022
Via arXiv

Ultra Fast Speech Separation Model with Teacher Student Learning

Apr 27, 2022
Via arXiv

Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

Apr 27, 2022
Via arXiv

Large-Scale Streaming End-to-End Speech Translation with Neural Transducers

Apr 11, 2022
Via arXiv

Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Speech Data

Mar 31, 2022
Via arXiv

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Mar 30, 2022
Via arXiv

Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems

Mar 02, 2022
Via arXiv

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Feb 05, 2022
Via arXiv

Endpoint Detection for Streaming End-to-End Multi-talker ASR

Jan 24, 2022
Via arXiv