Vitaly Lavrukhin

LibriSpeech-PC: Benchmark for Evaluation of Punctuation and Capitalization Capabilities of end-to-end ASR Models

Oct 04, 2023
Aleksandr Meister, Matvei Novikov, Nikolay Karpov, Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg

A Chat About Boring Problems: Studying GPT-based text normalization

Sep 23, 2023
Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg

Conformer-based Target-Speaker Automatic Speech Recognition for Single-Channel Audio

Aug 09, 2023
Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg

Confidence-based Ensembles of End-to-End Speech Recognition Models

Jun 27, 2023
Igor Gitman, Vitaly Lavrukhin, Aleksandr Laptev, Boris Ginsburg

Text-only domain adaptation for end-to-end ASR using integrated text-to-mel-spectrogram generator

Feb 27, 2023
Vladimir Bataev, Roman Korostik, Evgeny Shabalin, Vitaly Lavrukhin, Boris Ginsburg

Damage Control During Domain Adaptation for Transducer Based Automatic Speech Recognition

Oct 06, 2022
Somshubra Majumdar, Shantanu Acharya, Vitaly Lavrukhin, Boris Ginsburg

NeMo Toolbox for Speech Dataset Construction

Apr 11, 2021
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg

SPGISpeech: 5,000 hours of transcribed financial audio for fully formatted end-to-end speech recognition

Apr 06, 2021
Patrick K. O'Neill, Vitaly Lavrukhin, Somshubra Majumdar, Vahid Noroozi, Yuekai Zhang, Oleksii Kuchaiev, Jagadeesh Balam, Yuliya Dovzhenko, Keenan Freyberg, Michael D. Shulman, Boris Ginsburg, Shinji Watanabe, Georg Kucsko

Citrinet: Closing the Gap between Non-Autoregressive and Autoregressive End-to-End Models for Automatic Speech Recognition

Apr 05, 2021
Somshubra Majumdar, Jagadeesh Balam, Oleksii Hrinchuk, Vitaly Lavrukhin, Vahid Noroozi, Boris Ginsburg

Hi-Fi Multi-Speaker English TTS Dataset

Apr 03, 2021
Evelina Bakhturina, Vitaly Lavrukhin, Boris Ginsburg, Yang Zhang
