Aleksandr Laptev

Fast Context-Biasing for CTC and Transducer ASR models with CTC-based Word Spotter

Jun 11, 2024
Via arXiv

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023
Via arXiv

Confidence-based Ensembles of End-to-End Speech Recognition Models

Jun 27, 2023
Via arXiv

Powerful and Extensible WFST Framework for RNN-Transducer Losses

Mar 18, 2023
Via arXiv

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

Dec 16, 2022
Via arXiv

CTC Variations Through New WFST Topologies

Oct 06, 2021
Via arXiv

LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring

Apr 06, 2021
Via arXiv

Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

Mar 12, 2021
Via arXiv

Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset

Jun 15, 2020
Via arXiv

Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario

May 14, 2020
Via arXiv