Aleksandr Laptev

The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System

Oct 18, 2023
Tae Jin Park, He Huang, Ante Jukic, Kunal Dhawan, Krishna C. Puvvada, Nithin Koluguri, Nikolay Karpov, Aleksandr Laptev, Jagadeesh Balam, Boris Ginsburg

Confidence-based Ensembles of End-to-End Speech Recognition Models

Jun 27, 2023
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg

Powerful and Extensible WFST Framework for RNN-Transducer Losses

Mar 18, 2023
Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg

Fast Entropy-Based Methods of Word-Level Confidence Estimation for End-To-End Automatic Speech Recognition

Dec 16, 2022
Aleksandr Laptev, Boris Ginsburg

CTC Variations Through New WFST Topologies

Oct 06, 2021
Aleksandr Laptev, Somshubra Majumdar, Boris Ginsburg

LT-LM: a novel non-autoregressive language model for single-shot lattice rescoring

Apr 06, 2021
Anton Mitrofanov, Mariya Korenevskaya, Ivan Podluzhny, Yuri Khokhlov, Aleksandr Laptev, Andrei Andrusenko, Aleksei Ilin, Maxim Korenevsky, Ivan Medennikov, Aleksei Romanenko

Dynamic Acoustic Unit Augmentation With BPE-Dropout for Low-Resource End-to-End Speech Recognition

Mar 12, 2021
Aleksandr Laptev, Andrei Andrusenko, Ivan Podluzhny, Anton Mitrofanov, Ivan Medennikov, Yuri Matveev

Exploration of End-to-End ASR for OpenSTT -- Russian Open Speech-to-Text Dataset

Jun 15, 2020
Andrei Andrusenko, Aleksandr Laptev, Ivan Medennikov

Target-Speaker Voice Activity Detection: a Novel Approach for Multi-Speaker Diarization in a Dinner Party Scenario

May 14, 2020
Ivan Medennikov, Maxim Korenevsky, Tatiana Prisyach, Yuri Khokhlov, Mariya Korenevskaya, Ivan Sorokin, Tatiana Timofeeva, Anton Mitrofanov, Andrei Andrusenko, Ivan Podluzhny, Aleksandr Laptev, Aleksei Romanenko

You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation

May 14, 2020
Aleksandr Laptev, Roman Korostik, Aleksey Svischev, Andrei Andrusenko, Ivan Medennikov, Sergey Rybin
