Ivan Medennikov

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens

Sep 10, 2024

Resource-Efficient Adaptation of Speech Foundation Models for Multi-Speaker ASR

Sep 02, 2024

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks

Aug 23, 2024

LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring

Apr 06, 2021

Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

Mar 12, 2021

Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset

Jun 15, 2020

Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario

May 14, 2020

You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation

May 14, 2020

Towards a Competitive End-to-End Speech Recognition for CHiME-6 Dinner Party Transcription

Apr 24, 2020

Exploring End-to-End Techniques for Low-Resource Speech Recognition

Jul 02, 2018